NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. BACKGROUND OF THE INVENTION 

1.1 TECHNICAL FIELD

The present invention provides novel polynucleotides and proteins encoded by 
such polynucleotides, along with uses for these polynucleotides and proteins, for example 
in therapeutic, diagnostic and research methods. 

1.2 BACKGROUND

Technology aimed at the discovery of protein factors (including, e.g., cytokines
such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured
rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid
sequence of the protein in the case of hybridization cloning; activity of the protein in the
case of expression cloning). More recent "indirect" cloning techniques such as signal
sequence cloning, which isolates DNA sequences based on the presence of a now
well-recognized secretory leader sequence motif, as well as various PCR-based or low
stringency hybridization-based cloning techniques, have advanced the state of the art by
making available large numbers of DNA/amino acid sequences for proteins that are
known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of
PCR-based techniques, or by virtue of structural similarity to other genes of known
biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications
in, for example, diagnostics, forensics, gene mapping, identification of mutations
responsible for genetic disorders or other traits, assessment of biodiversity, and production
of many other types of data and products dependent on DNA and amino acid sequences.


2. SUMMARY OF THE INVENTION 



WO 02/074961
PCT/US02/05109



The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize one or more epitopes present on such polypeptides, as well as
hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically
engineered to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic
acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by hybridization (SBH), and in some cases, sequences obtained from one or
more public databases. The invention relates also to the proteins encoded by such
polynucleotides, along with therapeutic, diagnostic and research utilities for these
polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO:
1-526. The polypeptide sequences are designated SEQ ID NOS: 527-1052. The nucleic
acids and polypeptides are provided in the Sequence Listing. In the nucleic acids provided
in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is
any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to
the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid
sequences that hybridize to the complement of SEQ ID NO: 1-526 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; nucleic acid sequences
that encode a peptide comprising a specific domain or truncation of the peptides of
SEQ ID NO: 527-1052; and polynucleotides comprising a nucleotide sequence having at
least 90% identity to an identifying sequence of SEQ ID NO: 1-526 or a degenerate variant
or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-526. The sequence
information can be a segment of any one of SEQ ID NO: 1-526 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-526.

A collection as used in this application can be a collection of only one
polynucleotide. The collection of sequence information or identifying information of each
sequence can be provided on a nucleic acid array. In one embodiment, segments of
sequence information are provided on a nucleic acid array to detect the polynucleotide that
contains the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic
acid sequences recited above; cloning or expression vectors containing the nucleic acid
sequences; and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences (or their reverse or direct complements) according to the invention have
numerous applications in a variety of techniques known to those skilled in the art of
molecular biology, such as use as hybridization probes, use as primers for PCR, use in an
array, use in computer-readable media, use in sequencing full-length genes, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-526 or
novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-526 or novel segments or parts of the nucleic acids
provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:
1-526; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-526; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO: 1-526. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide
sequences set forth in SEQ ID NO: 1-526; (b) a nucleotide sequence encoding any one of
the amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an
allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a
species homolog (e.g., orthologs) of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any
of the polypeptides comprising an amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing;
or the corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides
having a nucleotide sequence set forth in SEQ ID NO: 1-526; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically or immunologically active variants of any of the polypeptide
sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least
about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity)
that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.
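
The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two amino acid sequences have already been aligned to equal length, with "-" marking gaps. The function name and the choice of denominator (all aligned columns that are not gap-only) are illustrative assumptions, not the method of the specification; alignment programs differ in how they count gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are already aligned (equal length, gaps as `gap`).
    Columns that are gaps in both sequences are ignored; a residue facing
    a gap counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


def is_substantial_equivalent(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if percent identity meets the chosen threshold (e.g. 65 to 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, two ten-residue sequences that differ at one position score 90% and would meet a 90% threshold but not a 95% one.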

The invention also provides compositions comprising a polypeptide of the
invention. Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture
medium under conditions permitting expression of the desired polypeptide, and purifying
the polypeptide from the culture or from the host cells. Preferred embodiments include
those in which the protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a
variety of techniques known to those skilled in the art of molecular biology. These
techniques include use as hybridization probes, use as oligomers, or primers, for PCR,
use for chromosome and gene mapping, use in the recombinant production of protein,
and use in generation of anti-sense DNA or RNA, their chemical analogs and the like.
For example, when the expression of an mRNA is largely restricted to a particular cell or
tissue type, polynucleotides of the invention can be used as hybridization probes to detect
the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The polypeptides according to the invention can be used in a variety of
conventional procedures and methods that are currently applied to other proteins. For
example, a polypeptide of the invention can be used to generate an antibody that
specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies,
are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the
invention can also be used as molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of the
present invention and a pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be 
utilized, for example, in methods for the prevention and/or treatment of disorders 
involving aberrant protein expression or biological activity. 

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as
recited herein and for the identification of subjects exhibiting a predisposition to such
conditions. The invention provides a method for detecting the polynucleotides of the
invention in a sample, comprising contacting the sample with a compound that binds to
and forms a complex with the polynucleotide of interest for a period sufficient to form
the complex and under conditions sufficient to form a complex and detecting the complex
such that if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a
sample comprising contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex and detecting the formation of the complex such that if a complex is formed, the
polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of
the invention. Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and monitoring the progress of patients, involved in clinical trials for the treatment
of disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides
and/or polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.
Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of
the invention comprising contacting the compound with a polypeptide of the invention in
a cell for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and detecting the
complex by detecting the reporter gene sequence expression such that if expression of the
reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve
the administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products.
Compounds and other substances can effect such modulation either on the level of target
gene/protein expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them
are also useful for the same functions known to one of skill in the art as the polypeptides
and polynucleotides to which they have homology (set forth in Table 2); for which they
have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection.


3. DETAILED DESCRIPTION OF THE INVENTION 
3.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular
forms "a", "an" and "the" include plural references unless the context clearly dictates
otherwise.

The term "active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide. According
to the invention, the terms "biologically active" or "biological activity" refer to a protein
or peptide having structural, regulatory or biochemical functions of a naturally occurring
molecule. Likewise "immunologically active" or "immunological activity" refers to the
capability of the natural, recombinant or synthetic polypeptide to induce a specific
immune response in appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules.
The degree of complementarity between the nucleic acid strands has significant effects on
the efficiency and strength of the hybridization between the nucleic acid strands.
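
As a minimal sketch of the base-pairing rule just described, the following code derives the complement of a DNA strand and scores how completely two antiparallel strands pair. The function names are illustrative assumptions, and the model considers only Watson-Crick A/T and G/C pairs in equal-length strands, with no gaps or bulges.

```python
# Watson-Crick pairing for DNA. Names are illustrative, not from the specification.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Base-by-base complement; for a 5'->3' input the result reads 3'->5'."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def paired_fraction(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions forming a Watson-Crick pair.

    1.0 corresponds to "complete" complementarity; a lower value is "partial".
    """
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    if not strand_5to3:
        raise ValueError("strands must be non-empty")
    matches = sum(
        1
        for a, b in zip(strand_5to3.upper(), other_3to5.upper())
        if PAIRS[a] == b
    )
    return matches / len(strand_5to3)
```

With the example from the text, `complement("AGT")` gives `"TCA"`, the strand written 3' to 5', and `paired_fraction("AGT", "TCA")` is 1.0.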

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term
"germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that
provide a steady and continuous source of germ cells for the production of gametes. The
term "primordial germ cells (PGCs)" refers to a small population of cells set aside from
other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis that have the potential to differentiate into germ cells and other cells.
PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs
and the ES cells are capable of self-renewal. Thus these cells not only populate the germ
line and give rise to a plurality of terminally differentiated cells that comprise the adult
specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the presence of the
EMF. EMFs include, but are not limited to, promoters, and promoter modulating
sequences (inducible elements). One class of EMFs are nucleic acid fragments which
induce the expression of an operably linked ORF in response to a specific regulatory
factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic
or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or
RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and N is A, C, G or T (U). It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid
which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion,"
or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about
7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably less than about 100 nucleotides, more preferably less than about 50
nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from
about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50
nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from
about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain
reaction (PCR), various hybridization procedures or microarray procedures to identify or
amplify identical or related parts of mRNA or DNA molecules. A fragment or segment
may uniquely identify each polynucleotide sequence of the present invention. Preferably
the fragment comprises a sequence substantially similar to any one of SEQ ID NOs: 1-526.

Probes may, for example, be used to determine whether specific mRNA
molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods
Appl 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR,
or other methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of
which are incorporated herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NOs: 1-526. The sequence
information can be a segment of any one of SEQ ID NOs: 1-526 that uniquely identifies
or represents the sequence information of that sequence of SEQ ID NO: 1-526. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are
three billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers
exist, there are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in
arrays for expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is fully matched in the expressed sequences is also approximately one in five
because expressed sequences comprise less than approximately 5% of the entire genome
sequence.
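
The counting argument above can be reproduced with a short sketch. It assumes a uniform random-sequence model in which each of the 4^k possible k-mers is equally likely at every position, and it uses the genome size and 5% expressed fraction quoted in the text. The function and constant names are illustrative. The computed ratios are of the same order as the rounded figures given in the passage.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence (from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the alphabet A, C, G, T."""
    return 4 ** k


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected chance occurrences of one specific k-mer among target_bp positions.

    Assumes every k-mer is equally likely at each position (a simplification;
    real genomes are not uniformly random).
    """
    return target_bp / possible_kmers(k)


if __name__ == "__main__":
    for k, target in [(20, GENOME_BP), (17, GENOME_BP), (15, GENOME_BP * EXPRESSED_FRACTION)]:
        odds = 1 / expected_full_matches(k, target)
        print(f"{k}-mer: about 1 chance match in {odds:.1f}")
```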

Similarly, when using sequence information for detecting a single mismatch, a
segment can be a twenty-five-mer. The probability that the twenty-five-mer would appear in
a human genome with a single mismatch is calculated by multiplying the probability for a
full match (1 ÷ 4^25) times the increased probability for mismatch at each nucleotide position
(3 x 25). The probability that an eighteen-mer with a single mismatch can be detected in an
array for expression studies is approximately one in five. The probability that a twenty-mer
with a single mismatch can be detected in a human genome is approximately one in five.
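
The single-mismatch calculation can be sketched the same way: a k-mer has 3 x k single-base variants (three alternative bases at each of k positions), and each variant has probability 1 ÷ 4^k at a given position under a uniform random-sequence model. The names below are illustrative, and the genome size and 5% expressed fraction are the figures quoted in the text. The results are of the same order as the "approximately one in five" figures above.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence (from the text)


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer differs from a given k-mer at exactly one position.

    Computed as (1 / 4**k) * (3 * k): the full-match probability times the
    number of single-base variants.
    """
    return (3 * k) / 4 ** k


def expected_single_mismatch_hits(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of single-mismatch occurrences among target_bp positions."""
    return single_mismatch_probability(k) * target_bp


if __name__ == "__main__":
    print(single_mismatch_probability(25))
    print(expected_single_mismatch_hits(20))
    print(expected_single_mismatch_hits(18, GENOME_BP * EXPRESSED_FRACTION))
```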

The term "open reading frame," ORF, means a series of nucleotide triplets coding
for amino acids without any termination codons and is a sequence translatable into
protein.
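
A minimal sketch of this definition follows: a sequence qualifies if it is a whole number of triplets and none of its in-frame triplets is a termination codon. The stop codons are those of the standard genetic code; the function names are illustrative assumptions, and the check does not require a start codon because the definition above does not mention one.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code


def in_frame_codons(seq: str) -> list:
    """Split a sequence into consecutive triplets, reading from the first base."""
    dna = seq.upper().replace("U", "T")  # accept RNA input by mapping U to T
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a non-empty series of complete triplets with no stop codon in frame."""
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    return all(codon not in STOP_CODONS for codon in in_frame_codons(seq))
```

Note that a stop triplet appearing out of frame (for example the TAG inside ATAGCC) does not interrupt the reading frame under this check.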

The terms "operably linked" or "operably associated" refer to functionally related
nucleic acid sequences. For example, a promoter is operably associated or operably
linked with a coding sequence if the promoter controls the transcription of the coding
sequence. While operably linked nucleic acid sequences can be contiguous and in the
same reading frame, certain genetic elements, e.g., repressor genes, are not contiguously
linked to the coding sequence but still control transcription/translation of the coding
sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a
number of differentiated cell types that are present in an adult organism. A pluripotent
cell is restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to
naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7 amino acids, more preferably at least about 9 amino acids and most
preferably at least about 17 or more amino acids. The peptide preferably is not greater
than about 200 amino acids, more preferably less than 150 amino acids and most
preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200
amino acids. To be active, any polypeptide must have sufficient length to display
biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by
cells that have not been genetically engineered and specifically contemplates various
polypeptides arising from post-translational modifications of the polypeptide including,
but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation
and acylation.

The term "translated protein coding portion" means a sequence which encodes for 
the full length protein which may include any leader sequence or any processing 
sequence. 

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion"
means that portion of the protein which does not include a signal or leader sequence. The
peptide may have been produced by processing in the cell which removes any
leader/signal sequence. The mature protein portion may or may not include the initial
methionine residue. The methionine residue may be removed from the protein during
processing in the cell. The peptide may be produced synthetically or the protein may
have been produced using a polynucleotide only encoding for the mature protein coding
sequence.

The term "derivative" refers to polypeptides chemically modified by such
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer attachment such as pegylation (derivatization with polyethylene glycol)
and insertion or substitution by chemical synthesis of amino acids such as ornithine,
which do not normally occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created
using, e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues may be replaced, added or deleted without abolishing activities of interest, may
be found by comparing the sequence of the particular polypeptide with that of
homologous peptides and minimizing the number of amino acid sequence changes made
in regions of high homology (conserved regions) or by replacing amino acids with
consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides
may be synthesized or selected by making use of the "redundancy" in the genetic code.
Various codon substitutions, such as the silent changes which produce various restriction
sites, may be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties, i.e.,
conservative amino acid replacements. "Conservative" amino acid substitutions may be
made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably
in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The
variation allowed may be experimentally determined by systematically making insertions,
deletions, or substitutions of amino acids in a polypeptide molecule using recombinant
DNA techniques and assaying the resulting recombinant variants for activity.
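
The four residue classes listed above can be written as a lookup table. The sketch below treats a substitution as "conservative" when both residues fall in the same class. That single criterion is an illustrative simplification (the text also mentions solubility and amphipathic nature), and the names are assumptions. One-letter codes follow the standard amino acid abbreviations.

```python
# Residue classes as listed in the text, using standard one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same class (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)
```

Under this grouping, leucine to isoleucine or lysine to arginine counts as conservative, while aspartic acid to lysine does not.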

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities,
or degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more preferably at least 99% by weight, of the indicated biological
macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic
acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or
other component normally present in a solution of the same. The terms "isolated" and
"purified" do not encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein,
means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect,
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or
proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant microbial" defines a polypeptide or protein essentially free of native
endogenous substances and unaccompanied by associated native glycosylation.
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern in general different from those expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage
or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element or elements having a regulatory role in gene expression, for example,
promoters or enhancers, (2) a structural or coding sequence which is transcribed into
mRNA and translated into protein, and (3) appropriate transcription initiation and
termination sequences. Structural units intended for use in yeast or eukaryotic expression
systems preferably include a leader sequence enabling extracellular secretion of
translated protein by a host cell. Alternatively, where recombinant protein is expressed
without a leader or transport sequence, it may include an amino terminal methionine
residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. Recombinant expression systems
as defined herein will express heterologous polypeptides or proteins upon induction of
the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
This term also means host cells which have stably integrated a recombinant genetic
element or elements having a regulatory role in gene expression, for example, promoters
or enhancers. Recombinant expression systems as defined herein will express
polypeptides or proteins endogenous to the cell upon induction of the regulatory elements
linked to the endogenous DNA segment or gene to be expressed. The cells can be
prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell
in which they are expressed. "Secreted" proteins also include without limitation proteins
that are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are also intended to include proteins containing non-typical signal sequences
(e.g., Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143)
and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist,
see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or
leader sequence" which will direct the polypeptide through the membrane of a cell. Such
a sequence may be naturally present on the polypeptides of the present invention or
provided from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood
in the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS),
1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately
stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary
hybridization conditions are described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides),
55°C (for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides).

As used herein, "substantially equivalent" or "substantially similar" can refer both
to nucleotide and amino acid sequences, for example a mutant sequence, that varies from
a reference sequence by one or more substitutions, deletions, or additions, the net effect
of which does not result in an adverse functional dissimilarity between the reference and
subject sequences. Typically, such a substantially equivalent sequence varies from one of
those listed herein by no more than about 35% (i.e., the number of individual residue
substitutions, additions, and/or deletions in a substantially equivalent sequence, as
compared to the corresponding reference sequence, divided by the total number of
residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence
is said to have 65% sequence identity to the listed sequence. In one embodiment, a
substantially equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); in a further variation of
this embodiment, by no more than 20% (80% sequence identity); in a further variation
of this embodiment, by no more than 10% (90% sequence identity); and in a further
variation of this embodiment, by no more than 5% (95% sequence identity). Substantially
equivalent, e.g., mutant, amino acid sequences according to the invention preferably have
at least 80% sequence identity with a listed amino acid sequence, more preferably at least
85% sequence identity, more preferably at least 90% sequence identity, more preferably
at least 95% sequence identity, more preferably at least 98% sequence identity, and most
preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences
of the invention can have lower percent sequence identities, taking into account, for
example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity,
more preferably at least about 80% sequence identity, more preferably at least 85%
sequence identity, more preferably at least 90% sequence identity, more preferably at
least about 95% sequence identity, more preferably at least 98% sequence identity, and
most preferably at least 99% sequence identity. For the purposes of the present
invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the
purposes of determining equivalence, truncation of the mature sequence (e.g., via a
mutation which creates a spurious stop codon) should be disregarded. Sequence identity
may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods
Enzymol. 183:626-645). Identity between sequences can also be determined by other
methods known in the art, e.g., by varying hybridization conditions.
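
By way of illustration only, the arithmetic in the definition above (the number of residue
substitutions, additions, and/or deletions divided by the total number of residues in the
substantially equivalent sequence) can be sketched in Python. The function names, the use of
"-" as a gap character, and the requirement that the two sequences be supplied already
aligned are assumptions made for this sketch; it is not the Jotun Hein method, and it does
not assess the functional criterion that the definition also requires.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Differences between two pre-aligned sequences, divided by the number
    of residues in the subject (candidate equivalent) sequence.

    Both inputs must be the same length; '-' marks a gap (assumed convention).
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    # A mismatched column is a substitution; a gap in either row is an
    # addition or deletion. Each differing column counts once.
    differences = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues


def within_variation_limit(reference_aln: str, subject_aln: str,
                           max_fraction: float = 0.35) -> bool:
    """True when the varied fraction is at or below the chosen limit."""
    return fraction_varied(reference_aln, subject_aln) <= max_fraction


if __name__ == "__main__":
    # One substitution in ten residues: fraction 0.1, i.e. 90% identity.
    print(fraction_varied("ACGTACGTAC", "ACGTTCGTAC"))
```

Passing max_fraction=0.05 applies the 95% identity variation in the same way.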

The term "totipotent" refers to the capability of a cell to differentiate into all of 
the cell types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so
that the DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether or not any coding sequences are in fact expressed. The term
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a
virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can
be readily identified using known UMFs as a target sequence or target motif with the
computer-based systems described below. The presence and activity of a UMF can be
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic
acid molecule is then incubated with an appropriate host under appropriate conditions and
the uptake of the marker sequence is determined. As described above, a UMF will
increase the frequency of uptake of a linked marker sequence.

Each of the above terms is meant to encompass all that is described for each,
unless the context dictates otherwise.

3.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising
the nucleotide sequences of SEQ ID NO: 1 - 526; a polynucleotide encoding any one of
the peptide sequences of SEQ ID NO: 527 - 1052; and a polynucleotide comprising the
nucleotide sequence encoding the mature protein coding sequence of the polynucleotides
of any one of SEQ ID NO: 1 - 526. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent conditions
to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1 - 526; (b)
nucleotide sequences encoding any one of the amino acid sequences set forth in the
Sequence Listing as SEQ ID NO: 527 - 1052; (c) a polynucleotide which is an allelic
variant of any polynucleotide recited above; (d) a polynucleotide which encodes a
species homolog of any of the proteins recited above; or (e) a polynucleotide that encodes
a polypeptide comprising a specific domain or truncation of the polypeptides of SEQ ID
NO: 527 - 1052. Domains of interest may depend on the nature of the encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof; domains
in immunoglobulin-like proteins include the variable immunoglobulin-like domains;
domains in enzyme-like polypeptides include catalytic and substrate binding domains;
and domains in ligand polypeptides include receptor-binding domains.

The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include all of the coding region of the cDNA or may represent a
portion of the coding region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with known
methods using the sequence information disclosed herein. Such methods include the
preparation of probes or primers from the disclosed sequence information for identification
and/or amplification of genes in appropriate genomic libraries or other sources of genomic
materials. Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides
of SEQ ID NO: 1 - 526 can be obtained by screening appropriate cDNA or genomic DNA
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID
NO: 1 - 526 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1 - 526 may be used as the basis for suitable primer(s) that allow identification and/or
amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and
sequences (including cDNA and genomic sequences) obtained from one or more public
databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying
sequence information, representative fragment or segment information, or novel segment
information for the full-length gene.

The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides recited
above. Polynucleotides according to the invention can have, e.g., at least about 65%, at
least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%,
91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are
nucleic acid sequence fragments that hybridize under stringent conditions to any of the
nucleotide sequences of SEQ ID NO: 1 - 526, or complements thereof, which fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the
invention from other polynucleotide sequences in the same family of genes or can
differentiate human genes from genes of other species, and are preferably based on
unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to
these specific sequences, but also include allelic and species variations thereof. Allelic and
species variations can be routinely determined by comparing the sequence provided in SEQ
ID NO: 1 - 526, a representative fragment thereof, or a nucleotide sequence at least 90%
identical, preferably 95% identical, to SEQ ID NOs: 1 - 526 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention
includes nucleic acid molecules coding for the same amino acid sequences as do the specific
ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one
codon for another codon that encodes the same amino acid is expressly contemplated.
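
The codon substitution described above can be illustrated with a short Python sketch that
translates two coding sequences with the standard genetic code and checks whether they
encode the same amino acid sequence. The function names and the example sequences are
hypothetical and are not taken from the Sequence Listing; the sketch assumes an in-frame
DNA sequence whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when two coding sequences yield identical amino acid sequences."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT -> GCC (both Ala) and TTA -> CTG (both Leu) leave the protein unchanged.
    print(translate("ATGGCTTTA"), encode_same_protein("ATGGCTTTA", "ATGGCCCTG"))
```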

The nearest neighbor or homology result for the nucleic acids of the present
invention, including SEQ ID NOs: 1 - 526, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment
Search Tool, is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively,
a FASTA version 3 search against Genpept, using the Fastxy algorithm, may be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are
also provided by the present invention. Species homologs may be isolated and identified
by making suitable probes or primers from the sequences provided herein and screening a
suitable nucleic acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides
or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide
which also encode proteins which are identical, homologous or related to that encoded by
the polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences
which encode variants of the described nucleic acids. These amino acid sequence
variants may be prepared by methods known in the art by introducing appropriate
nucleotide changes into a native or variant polynucleotide. There are two variables in the
construction of amino acid sequence variants: the location of the mutation and the nature
of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably
constructed by mutating the polynucleotide to encode an amino acid sequence that does
not occur in nature. These nucleic acid alterations can be made at sites that differ in the
nucleic acids from different species (variable positions) or in highly conserved regions
(constant regions). Sites at such locations will typically be modified in series, e.g., by
substituting first with conservative choices (e.g., hydrophobic amino acid to a different
hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino
acid to a charged amino acid), and then deletions or insertions may be made at the target
site. Amino acid sequence deletions generally range from about 1 to 30 residues,
preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions
include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino
acid residues. Intrasequence insertions may range generally from about 1 to 10 amino
acid residues, preferably from 1 to 5 residues. Examples of terminal insertions include the
heterologous signal sequences necessary for secretion or for intracellular targeting in
different host cells and sequences such as FLAG or poly-histidine sequences useful for
purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences
are changed via site-directed mutagenesis. This method uses oligonucleotide sequences
to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient
adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of the site being changed. In general, the techniques of site-directed
mutagenesis are well known to those of skill in the art and this technique is exemplified
by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient
method for producing site-specific changes in a polynucleotide sequence was published
by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to
create amino acid sequence variants of the novel nucleic acids. When small amounts of
template DNA are used as starting material, primer(s) that differ slightly in sequence
from the corresponding region in the template DNA can generate the desired amino acid
variant. PCR amplification results in a population of product DNA fragments that differ
from the polynucleotide template encoding the polypeptide at the position specified by
the primer. The product DNA fragments replace the corresponding region in the plasmid
and this gives a polynucleotide encoding the desired amino acid variant.
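
As a non-limiting sketch of the oligonucleotide design described above, the following Python
function builds a mutagenic oligonucleotide consisting of a replacement codon with unchanged
template sequence on both sides. The function name, the default flank length of 12
nucleotides, and the example template are assumptions made for illustration; the flank
length actually needed to form a stable duplex depends on the sequence and the
hybridization conditions and is not specified herein.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Return template flank + replacement codon + template flank.

    template    -- in-frame coding strand, codon 0 starting at position 0
    codon_index -- zero-based index of the codon to replace
    new_codon   -- three-nucleotide replacement codon
    flank       -- unchanged nucleotides kept on each side (assumed value)
    """
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking template sequence")
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    # Hypothetical template: replace a cysteine codon (TGT) with serine (TCT).
    template = "AAAAAAAAAAAA" + "TGT" + "CCCCCCCCCCCC"
    print(mutagenic_oligo(template, codon_index=4, new_codon="TCT"))
```

The cysteine-to-serine change in the example mirrors the elimination of disulfide bridges
mentioned earlier in the definitions.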

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis
techniques well known in the art, such as, for example, the techniques in Sambrook et al.,
supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of the genetic code, other DNA sequences which encode substantially the
same or a functionally equivalent amino acid sequence may be used in the practice of the
invention for the cloning and expression of these novel nucleic acids. Such DNA
sequences include those which are capable of hybridizing to the appropriate novel nucleic
acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can 
be used to generate polynucleotides encoding chimeric or fusion proteins comprising one 
or more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any
of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include, for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the
mature protein coding sequences corresponding to any one of SEQ ID NO: 1 - 526, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that
direct the expression of that nucleic acid, or a functional equivalent thereof, in
appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of
other nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and
the like, that are well known in the art. Accordingly, the invention also provides a vector
including a polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the vector contains an origin of replication functional in at least one organism,
convenient restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to the invention include expression vectors, replication vectors, probe
generation vectors, and sequencing vectors. A host cell according to the invention can be
a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a
multicellular organism.

The present invention further provides recombinant constructs comprising a
nucleic acid having any of the nucleotide sequences of SEQ ID NOs: 1 - 526 or a
fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or
viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID
NOs: 1 - 526 or a fragment thereof is inserted, in a forward or reverse orientation. In the
case of a vector comprising one of the ORFs of the present invention, the vector may
further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those
of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBsKS, pNH8a,
pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an
expression control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al. Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly. Many suitable expression control sequences are known in the art.
General methods of expressing recombinant proteins are also known and are exemplified
in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein
"operably linked" means that the isolated polynucleotide of the invention and an
expression control sequence are situated within a vector or cell in such a way that the
protein is expressed by a host cell which has been transformed (transfected) with the
ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within
the level of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of replication and selectable markers permitting transformation of the host
cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly-expressed gene to direct transcription of a downstream
structural sequence. Such promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat
shock proteins, among others. The heterologous structural sequence is assembled in
appropriate phase with translation initiation and termination sequences, and preferably, a
leader sequence capable of directing secretion of translated protein into the periplasmic
space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant
product. Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable translation
initiation and termination signals in operable reading phase with a functional promoter.
The vector will comprise one or more phenotypic selectable markers and an origin of
replication to ensure maintenance of the vector and to, if desirable, provide amplification
within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus
subtilis, Salmonella typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication derived
from commercially available plasmids comprising genetic elements of the well known
cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega
Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be expressed. Following
transformation of a suitable host strain and growth of the host strain to an appropriate cell
density, the selected promoter is induced or derepressed by appropriate means (e.g.,
temperature shift or chemical induction) and cells are cultured for an additional period.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means,
and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses.
For example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated
herein by reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies against the encoded polypeptide following topical administration of
naked plasmid DNA or following injection, and preferably intra-muscular injection of the
DNA. The nucleic acid sequences are preferably inserted in a recombinant expression
vector and may be in the form of naked DNA.

3.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid
molecules that are hybridizable to or complementary to the nucleic acid molecule
comprising the nucleotide sequence of SEQ ID NO: 1 - 526, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided that
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 527 - 1052 or antisense nucleic acids complementary to a nucleic acid
sequence of SEQ ID NO: 1 - 526 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence of the invention. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that
flank the coding region and that are not translated into amino acids (i.e., also referred to
as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., 

15 SEQ ID NO: 1 - 526, antisense nucleic acids of the invention can be designed according 
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid 
molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an mRNA. For example, the antisense oligonucleotide can be 

20 complementary to the region surrounding the translation start site of an mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the 
art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 

25 chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase 
the physical stability of the duplex formed between the antisense and sense nucleic acids, 
e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense
nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense
orientation to a target nucleic acid of interest, described further in the following
subsection).
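
As an illustration of the Watson-Crick design rule described above, the following minimal Python sketch derives an antisense oligonucleotide complementary to the region surrounding a translation start site. The function names, the example transcript, and the choice of the first AUG as the start site are assumptions made for illustration only; they are not taken from the sequence listing, and the true start codon of a given mRNA must be established separately.

```python
# Illustrative sketch only: helper names and the example transcript are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the Watson-Crick reverse complement of an RNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_to_start_site(mrna, length=20):
    """Return (target, antisense) for a window of `length` nt around the first AUG.

    Assumption: the first AUG in the transcript is treated as the start site.
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG start codon found")
    # Roughly center the window on the start codon, clamped to the transcript ends.
    left = max(0, min(start + 1 - length // 2, len(mrna) - length))
    target = mrna[left:left + length]
    return target, reverse_complement_rna(target)


if __name__ == "__main__":
    example_mrna = "GGCACGAGGCCAUGGCUAGCUUGACCGGAUAA"  # made-up transcript
    target, oligo = antisense_to_start_site(example_mrna, length=20)
    print("target   5'-" + target + "-3'")
    print("antisense 5'-" + oligo + "-3'")
```

The window length corresponds to one of the oligonucleotide lengths listed above (about 5 to 50 nucleotides). The sketch uses the RNA alphabet throughout; a synthesized oligonucleotide would typically be DNA or a modified analogue.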

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding a protein according to the invention to thereby inhibit
expression of the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by conventional nucleotide complementarity to form a stable duplex,
or, for example, in the case of an antisense nucleic acid molecule that binds to DNA
duplexes, through specific interactions in the major groove of the double helix. An
example of a route of administration of antisense nucleic acid molecules of the invention
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered systemically. For example,
for systemic administration, antisense molecules can be modified such that they
specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by
linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered
to cells using the vectors described herein. To achieve sufficient intracellular
concentrations of antisense molecules, vector constructs in which the antisense nucleic
acid molecule is placed under the control of a strong pol II or pol III promoter are
preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

3.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have
a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1 - 526). For example, a
derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the nucleotide sequence to be cleaved in
an mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No.
5,116,742. Alternatively, mRNA of the invention can be used to select a catalytic RNA
having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel
et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple
helical structures that prevent transcription of the gene in target cells. See generally,
Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci. 660:27-36; and Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see
Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the
four natural nucleobases are retained. The neutral backbone of PNAs has been shown to
allow for specific hybridization to DNA and RNA under conditions of low ionic strength.
The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al. (1996)
PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases
(Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization
(Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of
drug delivery known in the art. For example, PNA-DNA chimeras can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms
of base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996) above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain can be synthesized on a solid support using standard phosphoramidite
coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a
5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively,
chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment.
See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups
such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating
transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134). In addition, oligonucleotides can be modified with hybridization triggered
cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide
may be conjugated to another molecule, e.g., a peptide, a hybridization triggered
cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc.

3.5 HOSTS 

The present invention further provides host cells genetically engineered to contain 
the polynucleotides of the invention. For example, such host cells may contain nucleic 
acids of the invention introduced into the host cell using known transformation,
transfection or infection methods. The present invention still further provides host cells 
genetically engineered to express the polynucleotides of the invention, wherein such 
polynucleotides are in operative association with a regulatory sequence heterologous to 
the host cell which drives expression of the polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, 
or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in 
whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the polypeptide at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the encoding 
sequences. See, for example, PCT International Publication No. WO94/12650, PCT 
International Publication No. WO92/20808, and PCT International Publication No. 
WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) 
and/or intron DNA may be inserted along with the heterologous promoter DNA. If 
linked to the coding sequence, amplification of the marker DNA by standard selection 
methods results in co-amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a
lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell,
such as a bacterial cell. Introduction of the recombinant construct into the host cell can
be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one of the polynucleotides of the invention can be used in conventional
manners to produce the gene product encoded by the isolated fragment (in the case of an
ORF) or can be used to produce a heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa
cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis. The most preferred cells are those which do not normally express the
particular polypeptide or protein or which express the polypeptide or protein at a low
natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived from the DNA constructs
of the present invention. Appropriate cloning and expression vectors for use with
prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989),
the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7
lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines capable of expressing a compatible vector are, for example, the C127, monkey
COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human
epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed
primate cell lines, normal diploid cells, cell strains derived from in vitro culture of
primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or
Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and also any necessary ribosome binding sites, polyadenylation site,
splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be
used to provide the required nontranscribed genetic elements. Recombinant polypeptides
and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion
chromatography steps. Protein refolding steps can be used, as necessary, in completing
configuration of the mature protein. Finally, high performance liquid chromatography
(HPLC) can be employed for final purification steps. Microbial cells employed in
expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such
as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains,
Candida, or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella
typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the
protein is made in yeast or bacteria, it may be necessary to modify the protein produced
therein, for example by phosphorylation or glycosylation of the appropriate sites, in order
to obtain the functional protein. Such covalent attachments may be accomplished using
known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be
engineered to express an endogenous gene comprising the polynucleotides of the
invention under the control of inducible regulatory elements, in which case the regulatory
sequences of the endogenous gene may be replaced by homologous recombination. As
described herein, gene targeting can be used to replace a gene's existing regulatory region
with a regulatory sequence isolated from a different gene or a novel regulatory sequence
synthesized by genetic engineering methods. Such regulatory sequences may be
comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory
elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the
RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements,
splice sites, leader sequences for enhancing or modifying transport or secretion properties
of the protein, or other sequences which alter or improve the function or stability of
protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing
the gene under the control of the new regulatory sequence, e.g., inserting a new promoter
or enhancer or both upstream of a gene. Alternatively, the targeting event may be a
simple deletion of a regulatory element, such as the deletion of a tissue-specific negative
regulatory element. Alternatively, the targeting event may replace an existing element;
for example, a tissue-specific enhancer can be replaced by an enhancer that has broader
or different cell-type specificity than the naturally occurring elements. Here, the
naturally occurring sequences are deleted and new sequences are added. In all cases, the
identification of the targeting event may be facilitated by the use of one or more
selectable marker genes that are contiguous with the targeting DNA, allowing for the
selection of cells in which the exogenous DNA has integrated into the host cell genome.
The identification of the targeting event may also be facilitated by the use of one or more
marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the
negatively selectable marker flanks the targeting sequence, and such that a correct
homologous recombination event with sequences in the host cell genome does not result
in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International
Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by reference herein in its entirety.

3.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO:
527 - 1052 or an amino acid sequence encoded by any one of the nucleotide sequences
SEQ ID NOs: 1 - 526 or the corresponding full length or mature protein. Polypeptides of
the invention also include polypeptides preferably with biological or immunological
activity that are encoded by: (a) a polynucleotide having any one of the nucleotide
sequences set forth in SEQ ID NOs: 1 - 526 or (b) polynucleotides encoding any one of
the amino acid sequences set forth as SEQ ID NO: 527 - 1052 or (c) polynucleotides that
hybridize to the complement of the polynucleotides of either (a) or (b) under stringent
hybridization conditions. The invention also provides biologically active or
immunologically active variants of any of the amino acid sequences set forth as SEQ ID
NO: 527 - 1052 or the corresponding full length or mature protein; and "substantial
equivalents" thereof (e.g., with at least about 65%, at least about 70%, at least about
75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%, at least about 90%,
91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more typically at least
about 98%, or most typically at least about 99% amino acid identity) that retain
biological activity. Polypeptides encoded by allelic variants may have a similar,
increased, or decreased activity compared to polypeptides comprising SEQ ID NO: 527 -
1052.

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the
protein may be in linear form or they may be cyclized using known methods, for
example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and
in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such
as immunoglobulins for many purposes, including increasing the valency of protein
binding sites.
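
The percent amino acid identity figures recited above for "substantial equivalents" can be illustrated with a small calculation. The Python sketch below computes percent identity for two sequences that have already been aligned; it is an assumption-laden illustration rather than the method used by any particular alignment program. In particular, the function name is hypothetical, the example peptides are made up, and the choice of the alignment length as the denominator is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical residue columns / aligned columns,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # made-up peptide, not a SEQ ID NO
    variant = "MKTAHIAKQR"    # one substitution relative to the reference
    print(percent_identity(reference, variant))
```

Under this convention, a single substitution in a ten-residue alignment gives 90% identity, which would satisfy the "at least about 90%" threshold but not the "at least about 95%" threshold.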

The present invention also provides both full-length and mature forms (for
example, without a signal sequence or precursor sequence) of the disclosed proteins. The
protein coding sequence is identified in the sequence listing by translation of the
disclosed nucleotide sequences. The mature form of such protein may be obtained by
expression of a full-length polynucleotide in a suitable mammalian cell or other host cell.
The sequence of the mature form of the protein is also determinable from the amino acid
sequence of the full-length form. Where proteins of the present invention are membrane
bound, soluble forms of the proteins are also provided. In such forms, part or all of the
regions causing the proteins to be membrane bound are deleted so that the proteins are
fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the
nucleic acid fragments of the present invention or by degenerate variants of the nucleic
acid fragments of the present invention. By "degenerate variant" is intended nucleotide
fragments which differ from a nucleic acid fragment of the present invention (e.g., an
ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an
identical polypeptide sequence. Preferred nucleic acid fragments of the present invention
are the ORFs that encode proteins.
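
The definition of a "degenerate variant" given above can be checked computationally: two nucleotide sequences qualify if they differ in sequence yet translate to the same polypeptide. The following Python sketch uses the standard genetic code; the function names and the short example ORFs are hypothetical and serve only to illustrate the definition.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate a DNA ORF in frame 1, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a, orf_b):
    """True if the two ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    orf_1 = "ATGGCTCTGTCTAAATAA"  # made-up ORFs, not from the sequence listing
    orf_2 = "ATGGCCTTAAGCAAGTGA"
    print(translate(orf_1), translate(orf_2), is_degenerate_variant(orf_1, orf_2))
```

Here the two example ORFs differ at several codon positions but both encode the peptide MALSK, so the second is a degenerate variant of the first.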

A variety of methodologies known in the art can be utilized to obtain any one of
the isolated polypeptides or proteins of the present invention. At the simplest level, the
amino acid sequence can be synthesized using commercially available peptide
synthesizers. The synthetically-constructed protein sequences, by virtue of sharing
primary, secondary or tertiary structural and/or conformational characteristics with
proteins may possess biological properties in common therewith, including protein
activity. This technique is particularly useful in producing small peptides and fragments
of larger polypeptides. Fragments are useful, for example, in generating antibodies
against the native polypeptide. Thus, they may be employed as biologically active or
immunological substitutes for natural, purified proteins in screening of therapeutic
compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be
purified from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or protein
when the cell, through genetic manipulation, is made to produce a polypeptide or protein
which it normally does not produce or which the cell normally produces at a lower level.
One skilled in the art can readily adapt procedures for introducing and expressing either
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to
generate a cell which produces one of the polypeptides or proteins of the present
invention.

The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium, and
purifying the protein from the cells or the culture in which the cells are grown. For
example, the methods of the invention include a process for producing a polypeptide in
which a host cell containing a suitable expression vector that includes a polynucleotide of
the invention is cultured under conditions that allow expression of the encoded
polypeptide. The polypeptide can be recovered from the culture, conveniently from the
culture medium, or from a lysate prepared from the host cells and further purified.
Preferred embodiments include those in which the protein produced by such process is a
full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial
cells which naturally produce the polypeptide or protein. One skilled in the art can
readily follow known methods for isolating polypeptides and proteins in order to obtain
one of the isolated polypeptides or proteins of the present invention. These include, but
are not limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al.,
in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular Biology. Polypeptide fragments that retain biological/immunological activity
include fragments comprising greater than about 100 amino acids, or greater than about
200 amino acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well
known in the art to identify molecules which bind to the polypeptides. These molecules
include, but are not limited to, e.g., small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are
then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds
that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or
other cell by the specificity of the binding molecule for SEQ ID NO: 527 - 1052.

The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which
are characterized by somatic or germ cells containing a nucleotide sequence encoding the
protein.

The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be made by those skilled in the art using known techniques. Modifications
of interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For
example, one or more of the cysteine residues may be deleted or replaced with another
amino acid to alter the conformation of the molecule. Techniques for such alteration,
substitution, replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or deletion retains the desired activity of the protein. Regions of the protein that
are important for the protein function can be determined by various methods known in
the art including the alanine-scanning method which involves systematic substitution of
single or strings of amino acids with alanine, followed by testing the resulting
alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein
that are important for protein function may be determined by the eMATRIX program.

Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for screening or other
immunological methodologies may also be easily made by those skilled in the art given
the disclosures herein. Such modifications are encompassed by the present invention.
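
The alanine-scanning method mentioned above, in which single amino acids or strings of amino acids are systematically replaced with alanine, lends itself to a short illustration. The Python sketch below only enumerates the variant sequences that would then be synthesized or expressed and tested for biological activity; the function name, the `window` parameter, and the example peptide are assumptions made for illustration.

```python
def alanine_scan(protein, window=1):
    """Enumerate alanine-substituted variants of a protein sequence.

    Each result is (position, original_segment, variant_sequence), where
    position is 1-based and `window` consecutive residues are replaced by
    alanine. Segments that are already all alanine are skipped.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # substitution would leave the sequence unchanged
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, segment, variant))
    return variants


if __name__ == "__main__":
    for position, original, variant in alanine_scan("MKAL"):  # made-up peptide
        print(position, original, variant)
```

Setting `window` to a value greater than one corresponds to replacing strings of amino acids rather than single residues. The activity assay itself is experimental and is not modeled here.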

The protein may also be produced by operably linking the isolated polynucleotide
of the invention to suitable control sequences in one or more insect expression vectors,
and employing an insect expression system. Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from,
e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well
known in the art, as described in Summers and Smith, Texas Agricultural Experiment
Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an
insect cell capable of expressing a polynucleotide of the present invention is
"transformed."

The protein of the invention may be prepared by culturing transformed host cells
under culture conditions suitable to express the recombinant protein. The resulting
expressed protein may then be purified from such culture (i.e., from culture medium or
cell extracts) using known purification processes, such as gel filtration and ion exchange
chromatography. The purification of the protein may also include an affinity column
containing agents which will bind to the protein; one or more column steps over such
affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacrom blue 3GA
Sepharose™; one or more steps involving hydrophobic interaction chromatography using
such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity
chromatography.

Alternatively, the protein of the invention may also be expressed in a form which
will facilitate purification. For example, it may be expressed as a fusion protein, such as
those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a His tag. Kits for expression and purification of such fusion proteins are
commercially available from New England BioLab (Beverly, Mass.), Pharmacia
(Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an
epitope and subsequently purified by using a specific antibody directed to such epitope.
One such epitope ("FLAG®") is commercially available from Kodak (New Haven,
Conn.).

Finally, one or more reverse-phase high performance liquid chromatography
(RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other aliphatic groups, can be employed to further purify the protein. Some or
all of the foregoing purification steps, in various combinations, can also be employed to
provide a substantially homogeneous isolated recombinant protein. The protein thus
purified is substantially free of other mammalian proteins and is defined in accordance
with the present invention as an "isolated protein."

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted,
inserted, or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide or analog is fused to another moiety or moieties, e.g., a targeting moiety or
another therapeutic agent. Such analogs may exhibit improved properties such as activity
and/or stability. Examples of moieties which may be fused to the polypeptide or an
analog include, for example, targeting moieties which provide for the delivery of the
polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune
cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors
and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for
example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and steroids. Also, polypeptides may be fused to immune modulators, and
other cytokines such as alpha or beta interferon.

3.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the
largest match between the sequences tested. Methods to determine identity and similarity
are codified in computer programs including, but not limited to, the GCG program package,
including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN,
BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST
(Altschul S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by
reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999),
herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol.
4, pp. 202-209, herein incorporated by reference), pFam software (Sonnhammer et al.,
Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference)
and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31
(1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources
(BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
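
As an illustration of what "largest match" means in practice, the following is a minimal sketch of a global alignment in the Needleman-Wunsch style, followed by a percent-identity calculation over the aligned columns. The function names and the scoring values (match, mismatch, gap) are assumptions chosen for the example; they are not the default parameters of GAP, BLAST, or any other program named above, and the sketch uses a simple linear gap penalty rather than a substitution matrix.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings.

    Illustrative scoring only: +1 match, -1 mismatch, -1 per gap position.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns in which both sequences carry the same residue."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)
```

Here identity is computed over the full alignment length, gaps included; other conventions (for example, dividing by the length of the shorter sequence) give different values, so any reported percentage should state the convention and scoring parameters used.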

3.7 CHIMERIC AND FUSION PROTEINS

The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to another polypeptide. Within a fusion protein the polypeptide
according to the invention can correspond to all or a portion of a protein according to the
invention. In one embodiment, a fusion protein comprises at least one biologically active
portion of a protein according to the invention. In another embodiment, a fusion protein
comprises at least two biologically active portions of a protein according to the invention.
Within the fusion protein, the term "operatively linked" is intended to indicate that the
polypeptide according to the invention and the other polypeptide are fused in-frame to
each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the
middle.






For example, in one embodiment a fusion protein comprises a polypeptide 
according to the invention operably linked to the extracellular domain of a second 
protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., 
glutathione S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more
domains fused to sequences derived from a member of the immunoglobulin protein
family. The immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical compositions and administered to a subject to inhibit an interaction
between a ligand and a protein of the invention on the surface of a cell, to thereby
suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to
affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction
may be useful therapeutically for the treatment of proliferative and differentiative
disorders, e.g., cancer, as well as for modulating (e.g., promoting or inhibiting) cell survival.
Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a
ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation,
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends
as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John
Wiley & Sons, 1992). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protein of the invention.
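
The in-frame requirement described above can be checked computationally before a construct is built. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the short sequences in the usage note are made up for illustration and are not GST or any sequence of the invention, and only the three standard stop codons are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so the downstream one stays in the same reading frame.

    The terminal stop codon of the upstream sequence, if present, is removed.
    Raises ValueError if any part would shift the frame or if the fusion
    contains a stop codon before its final codon.
    """
    up = upstream_cds.upper()
    link = linker.upper()
    down = downstream_cds.upper()

    # Drop the upstream stop codon so translation reads through into the partner.
    if len(up) % 3 == 0 and up[-3:] in STOP_CODONS:
        up = up[:-3]

    # Each part must be a whole number of codons, or the downstream frame shifts.
    for name, part in (("upstream", up), ("linker", link), ("downstream", down)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length {len(part)} is not a multiple of 3")

    fusion = up + link + down
    # A stop codon anywhere but the last position would truncate the fusion protein.
    if any(c in STOP_CODONS for c in codons(fusion)[:-1]):
        raise ValueError("premature stop codon in fusion")
    return fusion
```

For example, `fuse_in_frame("ATGGCCTAA", "ATGAAATGA")` removes the upstream `TAA` and returns a single open reading frame, whereas an upstream fragment whose length is not a multiple of three is rejected. A real construct check would also account for vector-encoded linker and protease-site sequences, which this sketch only models through the optional `linker` argument.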

3.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of
normal function of the encoded protein. The invention thus provides gene therapy to
restore normal activity of the polypeptides of the invention, or to treat disease states
involving polypeptides of the invention. Delivery of a functional gene encoding
polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by
use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated
virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g.,
liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to
vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology
see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84
(1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the
nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or artificial chromosomes (stable expression). Cells may also be cultured ex
vivo in the presence of proteins of the present invention in order to proliferate or to
produce a desired effect on or activity in such cells. Treated cells can then be introduced
in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of polypeptides of
the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of
antisense molecules to the nucleic acids of the present invention, their complements, or their
translated RNA sequences, by methods known in the art. Further, the polypeptides of the
present invention can be inhibited by using targeted deletion methods, or the insertion of a
negative regulatory element such as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to
express the polynucleotides of the invention, wherein such polynucleotides are in operative
association with a regulatory sequence heterologous to the host cell which drives expression
of the polynucleotides in the cell. These methods can be used to increase or decrease the
expression of the polynucleotides of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease, expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression
by replacing, in whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter so that the cells express the protein at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the desired protein
encoding sequences. See, for example, PCT International Publication No. WO 94/12650,
PCT International Publication No. WO 92/20808, and PCT International Publication No.
WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes
carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron
DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods
results in co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein,
gene targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application
No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.


3.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed
or inactivated in the germ line of animals using homologous recombination [Capecchi,
Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the
regulatory control of exogenous or endogenous promoter elements, are known as
transgenic animals. Animals in which an endogenous gene has been inactivated by
homologous recombination are referred to as "knockout" animals. Knockout animals,
preferably non-human mammals, can be prepared as described in U.S. Patent No.
5,557,032, incorporated herein by reference. Transgenic animals are useful to determine
the roles polypeptides of the invention play in biological processes, and preferably in
disease states. Transgenic animals are useful as model systems to identify compounds
that modulate lipid metabolism. Transgenic animals, preferably non-human mammals,
are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO 94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of 
expression of the polypeptides of the invention. Inactivation can be carried out using 
homologous recombination methods described above. Activation can be achieved by 
supplementing or even replacing the homologous promoter to provide for increased 
protein expression. The homologous promoter can be supplemented by insertion of one 
or more heterologous enhancer elements known to confer promoter activation in a 
particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to 
express polypeptides of the invention or that express a variant polypeptide. Such animals 
are useful as models for studying the in vivo activities of polypeptide as well as for 
studying modulators of the polypeptides of the invention. 


3.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit
one or more of the uses or biological activities (including those associated with assays
cited herein) identified herein. Uses or activities described for proteins of the present
invention may be provided by administration or use of such proteins or of
polynucleotides encoding such proteins (such as, for example, in gene therapies or
vectors suitable for introduction of DNA). The mechanism underlying the particular
condition or pathology will dictate whether the polypeptides of the invention, the
polynucleotides of the invention or modulators (activators or inhibitors) thereof would be
beneficial to the subject in need of treatment. Thus, "therapeutic compositions of the
invention" include compositions comprising isolated polynucleotides (including
recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and
truncations or domains thereof), or compounds and other substances that modulate the
overall activity of the target gene products, either at the level of target gene/protein
expression or target protein activity. Such modulators include polypeptides, analogs
(variants), including fragments and fusion proteins, antibodies and other binding proteins;
chemical compounds that directly or indirectly activate or inhibit the polypeptides of the
invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of
the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

3.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the
research community for various purposes. The polynucleotides can be used to express
recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either
constitutively or at a particular stage of tissue differentiation or development or in disease
states); as molecular weight markers on gels; as chromosome markers or tags (when
labeled) to identify chromosomes or to map related gene positions; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes
to hybridize and thus discover novel, related DNA sequences; as a source of information
to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other support, including for
examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another
immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in interaction trap assays (such as, for
example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify
inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for
high-throughput screening; to raise antibodies or to elicit another immune response; as a
reagent (including the labeled reagent) in assays designed to quantitatively determine
levels of the protein (or its receptor) in biological fluids; as markers for tissues in which
the corresponding polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in a disease state); and, of
course, to isolate correlative receptors or ligands. Proteins involved in these binding
interactions can also be used to screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent
grade or kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include without limitation "Molecular
Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press,
Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology:
Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R.
Kimmel eds., 1987.

3.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source
of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be
added to the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the
case of microorganisms, the polypeptide or polynucleotide of the invention can be added to
the medium in or on which the microorganism is cultured.

3.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine,
cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell
populations. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many protein factors discovered to date, including all known cytokines, have
exhibited activity in one or more factor-dependent cell proliferation assays, and hence the
assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions of the present invention is evidenced by any one of a number of routine
factor dependent cell proliferation assays for cell lines including, without limitation, 32D,
DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123,
T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions
of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node
cells or thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology,
J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto.
1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic
cells include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human interleukin 6-Nordan, R. In
Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley
and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986;
Measurement of human Interleukin 11-Bennett, F., Giannotti, J., Clark, S. C. and Turner,
K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John
Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter
6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans);
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al.,
Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai
et al., J. Immunol. 140:508-512, 1988.

3.10.4 STEM CELL GROWTH FACTOR ACTIVITY 
A polypeptide of the present invention may exhibit stem cell growth factor
activity and be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide
of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell
populations in a totipotential or pluripotential state which would be useful for
re-engineering damaged or diseased tissues, transplantation, manufacture of
bio-pharmaceuticals and the development of bio-sensors. The ability to produce large
quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors;
implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or
cytokines may be administered in combination with the polypeptide of the invention to
achieve the desired effect, including any of the growth factors listed herein, other stem
cell maintenance factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble
IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha),
G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth
factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of these cells in culture will facilitate the production of large quantities of
mature cells. Techniques for culturing stem cells are known in the art and administration
of polypeptides of the invention, optionally with other growth factors and/or cytokines, is
expected to enhance the survival and proliferation of the stem cell populations. This can
be accomplished by direct administration of the polypeptide of the invention to the
culture medium. Alternatively, stromal cells transfected with a polynucleotide that
encodes the polypeptide of the invention can be used as a feeder layer for the stem
cell populations in culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to
induce autocrine expression of the polypeptide of the invention. This will allow for
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as
is or that can then be differentiated into the desired mature cell types. These stable cell
lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to
create cDNA libraries and templates for polymerase chain reaction experiments. These
studies would allow for the isolation and identification of differentially expressed genes
in stem cell populations that regulate stem cell proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in
the treatment of many pathological conditions. For example, polypeptides of the present
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial
cells that can be used to augment or replace cells damaged by illness, autoimmune
disease, accidental damage or genetic disorders. The polypeptide of the invention may be
useful for inducing the proliferation of neural cells and for the regeneration of nerve and
brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as well as mechanical and traumatic disorders which involve degeneration,
death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell
populations can also be genetically altered for gene therapy purposes and to decrease host
rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also
be manipulated to achieve controlled differentiation of the stem cells into more
differentiated cell types. A broadly applicable method of obtaining pure populations of a
specific differentiated cell type from undifferentiated stem cell populations involves the
use of a cell-type specific promoter driving a selectable marker. The selectable marker
allows only cells of the desired type to survive. For example, stem cells can be induced
to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991);
Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L.
W. In: Principles of Tissue Engineering eds. Lanza et al. Academic Press (1997)).
Alternatively, directed differentiation of stem cells can be accomplished by culturing the
stem cells in the presence of a differentiation factor such as retinoic acid and an
antagonist of the polypeptide of the invention which would inhibit the effects of
endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one
of various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A.
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on
semi-solid support, e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
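
As a purely illustrative sketch, and not part of the cited protocols, colony counts from such an assay could be summarized as a fold change over an untreated control. The function name, the replicate counts and the use of a simple mean are assumptions made only for this example.

```python
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Mean treated colony count divided by mean control colony count.

    Both arguments are lists of colonies counted per plate (hypothetical
    replicates); the summary statistic is an assumption of this sketch.
    """
    if not treated_counts or not control_counts:
        raise ValueError("both groups need at least one replicate")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical replicate counts, for illustration only.
fold = colony_fold_change([40, 44, 36], [20, 18, 22])
print(f"fold change over control: {fold:.1f}")
```

A fold change near 1 would suggest no effect on colony formation under these assumptions; any threshold for calling an effect meaningful would have to come from the experimental design, not from this sketch.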

3.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth and proliferation of erythroid progenitor cells alone or in
combination with other cytokines, thereby indicating utility, for example, in treating
various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the
production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to
prevent or treat consequent myelo-suppression; in supporting the growth and proliferation
of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in
place of or complementary to platelet transfusions; and/or in supporting the growth and
proliferation of hematopoietic stem cells which are capable of maturing to any and all of
the above-mentioned hematopoietic cells and therefore find therapeutic utility in various
stem cell disorders (such as those usually treated with transplantation, including, without
limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in-vivo or
ex-vivo (i.e., in conjunction with bone marrow transplantation or with peripheral
progenitor cell transplantation (homologous or heterologous)) as normal cells or
genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines
are cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without
limitation, those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller
et al. Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among
others, proteins that regulate lympho-hematopoiesis) include, without limitation, those
described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New
York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992;
Primitive hematopoietic colony forming cells with high proliferative potential, McNiece,
I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol
pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental
Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E.
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal
cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

3.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing
and tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone
growth in circumstances where bone is not normally formed has application in the
healing of bone fractures and cartilage damage or defects in humans and other animals.
Compositions of a polypeptide, antibody, binding partner, or other modulator of the
invention may have prophylactic use in closed as well as open fracture reduction and also
in the improved fixation of artificial joints. De novo bone formation induced by an
osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors
of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative
disorders, or periodontal disease, such as through stimulation of bone and/or cartilage
repair or by blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast activity, etc.) mediated by inflammatory processes may also be
possible using the composition of the invention.

Another category of tissue regeneration activity that may involve the polypeptide
of the present invention is tendon/ligament formation. Induction of tendon/ligament-like
tissue or other tissue formation in circumstances where such tissue is not normally
formed has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue.
De novo tendon/ligament-like tissue formation induced by a composition of the present
invention contributes to the repair of congenital, trauma induced, or other tendon or
ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention
may provide an environment to attract tendon- or ligament-forming cells, stimulate growth
of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be
useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament
defects. The compositions may also include an appropriate matrix and/or sequestering
agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and
localized neuropathies, and central nervous system diseases, such as Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head
trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from chemotherapy or other medical therapies may also be treatable using a composition
of the invention.

Compositions of the invention may also be useful to promote better or faster
closure of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells comprising
such tissues. Part of the desired effect may be achieved by inhibition or modulation of
fibrotic scarring, which may allow normal tissue to regenerate. A polypeptide of the
present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues,
and conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described
in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No. WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in:
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J.
Invest. Dermatol. 71:382-84 (1978).

3.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays
are described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting such activities. A protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g.,
in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as
affecting the cytolytic activity of NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or
fungal infections, or may result from autoimmune disorders. More specifically, infectious
diseases caused by viral, bacterial, fungal or other infection may be treatable using a
protein of the present invention, including infections by HIV, hepatitis viruses, herpes
viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be
useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present
invention include, for example, connective tissue disease, multiple sclerosis, systemic
lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease.
Such a protein (or antagonists thereof, including antibodies) of the present invention may
also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis,
serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis,
allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome,
allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant
papillary conjunctivitis and contact allergies), such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions, in which immune suppression is
desired (including, for example, organ transplantation), may also be treatable using a
protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo
animal models such as the cumulative contact enhancement test (Lastbom et al.,
Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54,
1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and
murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of an immune response. The functions of activated T cells may be inhibited by
suppressing T cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T cell responses is generally an active, non-antigen-specific,
process which requires continuous exposure of the T cells to the suppressive agent.
Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the
absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing
high level lymphokine synthesis by activated T cells, will be useful in situations of tissue,
skin and organ transplantation and in graft-versus-host disease (GVHD). For example,
blockage of T cell function should result in reduced tissue destruction in tissue
transplantation. Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition as foreign by T cells, followed by an immune reaction that
destroys the transplant. The administration of a therapeutic composition of the invention
may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an
immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize
the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B
lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a
subject, it may also be necessary to block the function of a combination of B lymphocyte
antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl.
Acad. Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to determine the effect of therapeutic compositions of the invention on the development
of that disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation of T cells that are reactive against self tissue and which promote the production
of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the
activation of autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block stimulation of T cells can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines
which may be involved in the disease process. Additionally, blocking reagents may
induce antigen-specific tolerance of autoreactive T cells which could lead to long-term
relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal
models of human autoimmune diseases. Examples include murine experimental
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB
hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and
BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a
means of up regulating immune responses, may also be useful in therapy. Upregulation of
immune responses may be in the form of enhancing an existing immune response or
eliciting an initial immune response. For example, enhancing an immune response may
be useful in cases of viral infection, including systemic viral diseases such as influenza,
the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient
by removing T cells from the patient, costimulating the T cells in vitro with viral
antigen-pulsed APCs either expressing a peptide of the present invention or together with
a stimulatory form of a soluble peptide of the present invention and reintroducing the in
vitro activated T cells into the patient. Another method of enhancing anti-viral immune
responses would be to isolate infected cells from a patient, transfect them with a nucleic
acid encoding a protein of the present invention as described herein such that the cells
express all or a portion of the protein on their surface, and reintroduce the transfected
cells into the patient. The infected cells would now be capable of delivering a
costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation
signal to T cells to induce a T cell mediated immune response against the transfected
tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules,
or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules,
can be transfected with nucleic acid encoding all or a portion (e.g., a
cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2
microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta
chain protein to thereby express MHC class I or MHC class II proteins on the cell
surface. Expression of the appropriate class I or class II MHC in conjunction with a
peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a
T cell mediated immune response against the transfected tumor cell. Optionally, a gene
encoding an antisense construct which blocks expression of an MHC class II associated
protein, such as the invariant chain, can also be cotransfected with a DNA encoding a
peptide having the activity of a B lymphocyte antigen to promote presentation of tumor
associated antigens and induce tumor specific immunity. Thus, the induction of a T cell
mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974,
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J.
Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991;
Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described
in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In
vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons,
Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J.
Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et
al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz
et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and
development include, without limitation, those described in: Antica et al., Blood
84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al.,
Blood 85:2770-2778, 1995; Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

3.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to
stimulate the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the
present invention, alone or in heterodimers with a member of the inhibin family, may be
useful as a contraceptive based on the ability of inhibins to decrease fertility in female
mammals and decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the invention, as a homodimer or as a heterodimer with other protein
subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based
upon the ability of activin molecules to stimulate FSH release from cells of the anterior
pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may
also be useful for advancement of the onset of fertility in sexually immature mammals, so
as to increase the lifetime reproductive performance of domestic animals such as, but not
limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be
measured by the following methods.

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale
et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al.,
Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.

3.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a desired cell population to a desired site of action. Chemotactic or chemokinetic
compositions (e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular advantages in treatment of wounds and other trauma to tissues, as well
as in treatment of localized infections. For example, attraction of lymphocytes,
monocytes or neutrophils to tumors or sites of infection may result in improved immune
responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it
can stimulate, directly or indirectly, the directed orientation or movement of such cell
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population
of cells can be readily determined by employing such protein or peptide in any known
assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or
prevent chemotaxis) consist of assays that measure the ability of a protein to induce the
migration of cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell population to another cell population. Suitable assays for movement
and adhesion include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W.
Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12,
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest.
95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Muller et al., Eur. J.
Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et
al., J. of Immunol. 153:1762-1768, 1994.
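
As an illustrative sketch only, readouts from a membrane migration assay of the kind referred to above could be reduced to a migration index and used to separate directed movement (chemotaxis) from undirected stimulation of movement (chemokinesis). The comparison of a gradient condition against a uniform-concentration condition, the function names, the example counts and the cutoff of 2.0 are all assumptions of this example and are not taken from the cited references.

```python
def migration_index(migrated_cells, baseline_cells):
    """Cells migrated in a test condition divided by cells migrated with medium alone."""
    if baseline_cells <= 0:
        raise ValueError("baseline migration must be positive")
    return migrated_cells / baseline_cells


def classify_migration(gradient_cells, uniform_cells, baseline_cells, cutoff=2.0):
    """Classify activity from three hypothetical counts.

    gradient_cells: protein only on the far side of the membrane (gradient).
    uniform_cells:  protein at equal concentration on both sides (no gradient).
    baseline_cells: medium alone.
    cutoff:         arbitrary index threshold chosen for this sketch.
    """
    gradient_index = migration_index(gradient_cells, baseline_cells)
    uniform_index = migration_index(uniform_cells, baseline_cells)
    if uniform_index >= cutoff:
        # Movement is stimulated even without a gradient.
        return "chemokinetic"
    if gradient_index >= cutoff:
        # Movement is stimulated only when a gradient is present.
        return "chemotactic"
    return "no clear activity"


# Hypothetical counts of migrated cells per field, for illustration only.
print(classify_migration(gradient_cells=120, uniform_cells=45, baseline_cells=40))
```

In practice the cutoff and the number of replicates would be set by the assay design; the sketch only shows how the two kinds of activity named in this section might be told apart from count data.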

3.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such attributes. Compositions may be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma,
surgery or other causes. A composition of the invention may also be useful for dissolving
or inhibiting formation of thromboses and for treatment and prevention of conditions
resulting therefrom (such as, for example, infarction of cardiac and central nervous
system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al.,
Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991);
Schaub, Prostaglandins 35:467-474, 1988.

3.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation,
proliferation or metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or
more types of cancer. For example, the presence or increased expression of a
polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a
precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or
absence of the polypeptide may be associated with a cancer condition. Identification of
single nucleotide polymorphisms associated with cancer or a predisposition to cancer
may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or
invasiveness. Therapeutic compositions of the invention may be effective in adult and
pediatric oncology including in solid phase tumors/malignancies, locally advanced
tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases,
blood cell malignancies including multiple myeloma, acute and chronic leukemias, and
lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid
cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast
cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers
including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers
including bladder cancer and prostate cancer, malignancies of the female genital tract
including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in
the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers
including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas,
metastatic tumor cell invasion in the central nervous system, bone cancers including
osteomas, skin cancers including malignant melanoma, tumor progression of human skin
keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and
Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can be
administered in therapeutically effective dosages alone or in combination with adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser
therapy, and may provide a beneficial effect, e.g., reducing tumor size, slowing rate of
tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition,
without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as
a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the
polypeptide or modulator of the invention with one or more anti-cancer drugs in addition
to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as
a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be
used as a treatment in combination with the polypeptide or modulator of the invention
include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide,
Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine,
5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon
Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine,
Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin,
Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine
sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2,
Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for
prophylactic treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g., exposure to carcinogens) known in the art that predispose an individual to
developing cancers. Under these circumstances, it may be beneficial to treat these
individuals with therapeutically effective doses of the polypeptide of the invention to
reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of
the invention as a potential cancer treatment. These in vitro models include proliferation
assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney,
(1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York,
NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in
Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9
(1997), and angiogenesis assays such as induction of vascularization of the chick
chorioallantoic membrane or induction of vascular endothelial cell migration as described
in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp.
Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g.,
from American Type Culture Collection catalogs.

3.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide
of the invention can encode a polypeptide exhibiting such characteristics. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their
ligands, receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors involved in cell-cell interactions and their ligands (including without limitation,
cellular adhesion molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs involved in antigen presentation, antigen recognition and
development of cellular and humoral immune responses). Receptors and ligands are also
useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without
limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of
receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be
measured by the following methods:

Suitable assays for receptor-ligand activity include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions 7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al.,
Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor
for a ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may
be identified through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists
or partial antagonists require the use of other proteins as competing ligands. The
polypeptides of the present invention or ligand(s) thereof may be labeled by being
coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional
methods. ("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in
Enzymology Vol 182 (1990) Academic Press, Inc. San Diego). Examples of
radioisotopes include, but are not limited to, tritium and carbon-14. Examples of
colorimetric molecules include, but are not limited to, fluorescent molecules such as
fluorescamine, or rhodamine or other colorimetric molecules. Examples of toxins
include, but are not limited to, ricin.

3.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using
the novel polypeptides or binding fragments thereof in any of a variety of drug screening
techniques. The polypeptides or fragments employed in such a test may either be free in
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably
transformed with recombinant nucleic acids expressing the polypeptide or a fragment
thereof. Drugs are screened against such transformed cells in competitive binding assays.
Such cells, either in viable or fixed form, can be used for standard binding assays. One
may measure, for example, the formation of complexes between polypeptides of the
invention or fragments and the agent being tested or examine the diminution in complex
formation between the novel polypeptides and an appropriate cell line, which are well
known in the art.

Sources for test compounds that may be screened for ability to bind to or 
modulate (i.e., increase or decrease) the activity of polypeptides of the invention include 
(1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) 
combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides
or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or 
compounds that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria
and fungi), animals, plants or other vegetation, or marine organisms, and libraries of
mixtures for screening may be created by: (1) fermentation and extraction of broths from
soil, plant or marine microorganisms or (2) extraction of the organisms themselves.
Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants thereof. For a review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides or organic compounds and can be readily prepared by traditional
automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of
particular interest are peptide and oligonucleotide combinatorial libraries. Still other
libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic
collection, recombinatorial, and polypeptide libraries. For a review of combinatorial
chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707
(1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol
Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 1(1):114-19 (1997);
Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the
"hit" to bind a polypeptide of the invention. The molecules identified in the binding assay
are then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin 
or cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity 
of the binding molecule for a polypeptide of the invention. Alternatively, the binding
molecules may be complexed with imaging agents for targeting and imaging purposes.

3.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide,
e.g., a ligand or a receptor. The art provides numerous assays particularly useful for
identifying previously unknown binding partners for receptor polypeptides of the
invention. For example, expression cloning using mammalian or bacterial cells, or
dihybrid screening assays can be used to identify polynucleotides encoding binding
partners. As another example, affinity chromatography with the appropriate immobilized
polypeptide of the invention can be used to isolate polypeptides that recognize and bind
polypeptides of the invention. There are a number of different libraries used for the
identification of compounds, and in particular small molecules, that modulate (i.e.,
increase or decrease) biological activity of a polypeptide of the invention. Ligands for
receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical
except for the expression of the receptor of the invention: one cell population expresses
the receptor of the invention whereas the other does not. The responses of the two cell
populations to the addition of ligand(s) are then compared. Alternatively, an expression
library can be co-expressed with the polypeptide of the invention in cells and assayed for
an autocrine response to identify potential ligand(s). As still another example, BIAcore
assays, gel overlay assays, or other methods known in the art can be used to identify
binding partner polypeptides, including, (1) organic and inorganic chemical libraries, (2)
natural product libraries, and (3) combinatorial libraries comprised of random peptides,
oligonucleotides or organic molecules.

The role of downstream intracellular signaling molecules in the signaling cascade
of the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.,
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

3.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory
activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells
involved in the inflammatory response, by inhibiting or promoting cell-cell interactions
(such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or suppressing production of other factors which more directly inhibit or
promote an inflammatory response. Compositions with such activities can be used to treat
inflammatory conditions (including chronic or acute conditions), including without
limitation inflammation associated with infection (such as septic shock, sepsis or systemic
inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin
lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting
from over production of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or
material. Compositions of this invention may be utilized to prevent or treat conditions
such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced
shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from
diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease,
inflammation associated with pulmonary disease, other autoimmune disease or
inflammatory disease, or as an antiproliferative agent, such as for acute or chronic myelogenous
leukemia, or in the prevention of premature labor secondary to intrauterine infections.

3.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of 
a therapeutic that promotes or inhibits function of the polynucleotides and/or 
polypeptides of the invention. Such leukemias and related disorders include but are not 
limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, 
myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic 
leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia 
(for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B.
Lippincott Co., Philadelphia). 

3.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication 
of therapeutic utility, include but are not limited to nervous system injuries, and diseases 
or disorders which result in either a disconnection of axons, a diminution or degeneration 
of neurons, or demyelination. Nervous system lesions which may be treated in a patient 
(including human and non-human mammalian patients) according to the invention 
include but are not limited to the following lesions of either the central (including spinal 
cord, brain) or peripheral nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated 
with surgery, for example, lesions which sever a portion of the nervous system, or 
compression injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous 
system results in neuronal injury or death, including cerebral infarction or ischemia, or 
spinal cord infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed
or injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion
of the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or injured by a demyelinating disease including but not limited to multiple
sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy
of various etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.

Therapeutics which are useful according to the invention for treatment of a
nervous system disorder may be selected by testing for biological activity in promoting
the survival or differentiation of neurons. For example, and not by way of limitation,
therapeutics which elicit any of the following effects may be useful according to the
invention:

(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting
of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional
disability.

In specific embodiments, motor neuron disorders that may be treated according to
the invention include but are not limited to disorders such as infarction, infection,
exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor neurons as well as other components of the nervous system, as well as
disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited to progressive spinal muscular atrophy, progressive bulbar
palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive
bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio
syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

3.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other
parasites; affecting (suppressing or enhancing) bodily characteristics, including, without
limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or organ or body part size or shape (such as, for example, breast
augmentation or diminution, change in bone form or shape); affecting biorhythms or

circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting
the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional
factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or
other pain reducing effects; promoting differentiation and growth of embryonic stem cells
in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case
of enzymes, correcting deficiencies of the enzyme and treating deficiency-related
diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis);
immunoglobulin-like activity (such as, for example, the ability to bind antigens or
complement); and the ability to act as an antigen in a vaccine composition to raise an
immune response against such protein or another material or entity which is
cross-reactive with such protein.


3.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for 
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential 

predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and
this genetic information can be used to tailor preventive or therapeutic treatment
appropriately. For example, the existence of a polymorphism associated with a
predisposition to inflammation or autoimmune disease makes possible the diagnosis of
this condition in humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample,
optionally involving isolation or amplification of the DNA, and identifying the presence
of the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the
DNA may be subjected to allele-specific oligonucleotide hybridization (in which
appropriate oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single base mismatch) or to a single nucleotide extension assay (in which
an oligonucleotide that hybridizes immediately adjacent to the position of the
polymorphism is extended with one or more labeled nucleotides). In addition, traditional
restriction fragment length polymorphism analysis (using restriction enzymes that
provide differential digestion of the genomic DNA depending on the presence or absence
of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences
of the present invention. In the alternative, any one of the nucleotide sequences of the
present invention can be placed on the array to detect changes from those sequences.
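
The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal sketch. The two sequences below are hypothetical and invented purely for illustration (they are not sequences of the invention), and the helper name `digest` is likewise an assumption. The sketch models a digest with EcoRI, which recognizes GAATTC and cuts after the first G, and shows how a single-base change that destroys one recognition site alters the fragment lengths.

```python
def digest(sequence: str, site: str, cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting `sequence` at every occurrence of `site`.

    `cut_offset` is the cut position inside the recognition site
    (1 models EcoRI, which cuts G^AATTC on the top strand).
    """
    cut_positions = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx + cut_offset)
        start = idx + 1  # continue searching after this hit
    bounds = [0] + cut_positions + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


ECORI_SITE = "GAATTC"

# Hypothetical alleles, invented for illustration only.
reference = "ATGCCGAATTCTTAGGCATCGAATTCGGA"  # two EcoRI sites
variant = "ATGCCGAATTCTTAGGCATCGAGTTCGGA"    # A>G change removes the second site

reference_fragments = digest(reference, ECORI_SITE)
variant_fragments = digest(variant, ECORI_SITE)

print("reference fragments:", reference_fragments)
print("variant fragments:  ", variant_fragments)
print("polymorphism detected:", reference_fragments != variant_fragments)
```

In practice the fragment lengths would be read from a gel or other sizing method rather than computed from known sequence; the sketch only shows why the presence or absence of the polymorphism changes the digestion pattern.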

Alternatively a polymorphism resulting in a change in the amino acid sequence 
could also be detected by detecting a corresponding change in amino acid sequence of the 
protein, e.g., by an antibody specific to the variant sequence. 

3.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is
described by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but
rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate buffered solution (PBS) at a dose of about 1-5 mg/kg. The
control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately administering the test compound and subsequent treatment every other day
until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an
overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of
the data would reveal that the test compound would have a dramatic effect on the
swelling of the joints as measured by a decrease of the arthritis score.
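
The treatment and scoring timetable described above can be laid out as a short sketch. It assumes that day 0 is the day of adjuvant injection and that "every other day" means days 0, 2, 4 and so on through day 24; the text does not state this explicitly, and the function name `treatment_days` is hypothetical.

```python
# Scoring days as listed in the text (days after adjuvant injection).
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def treatment_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days on which the test compound is given.

    Assumption: day 0 is the injection day (first dose given immediately),
    with further doses every `interval` days up to and including `last_day`.
    """
    return list(range(0, last_day + 1, interval))


schedule = treatment_days()
# Scoring days that also fall on a treatment day under this assumption.
scored_on_treatment_day = [day for day in SCORING_DAYS if day in schedule]

print("treatment days:", schedule)
print("number of doses:", len(schedule))
print("scoring days on a treatment day:", scored_on_treatment_day)
```

Under these assumptions the animal receives thirteen doses, and day 15 is the only scoring day that does not coincide with a dose.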

3.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and 
antibodies or other binding partners or modulators including antisense polynucleotides) 
of the invention have numerous applications in a variety of therapeutic methods. 
Examples of therapeutic applications include, but are not limited to, those exemplified
herein.

3.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of 
the polypeptides or other composition of the invention to individuals affected by a
disease or disorder that can be modulated by regulating the peptides of the invention.
While the mode of administration is not particularly important, parenteral administration
is preferred. An exemplary mode of administration is to deliver an intravenous bolus.
The dosage of the polypeptides or other composition of the invention will normally be
determined by the prescribing physician. It is to be expected that the dosage will vary
according to the age, weight, condition and response of the individual patient. Typically,
the amount of polypeptide administered per dose will be in the range of about 0.01 µg/kg
to 100 mg/kg of body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg
of patient body weight. For parenteral administration, polypeptides of the invention will
be formulated in an injectable form combined with a pharmaceutically acceptable
parenteral vehicle. Such vehicles are well known in the art and examples include water,
saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of
human serum albumin. The vehicle may contain minor amounts of additives that
maintain the isotonicity and stability of the polypeptide or other active ingredient. The 
preparation of such solutions is within the skill of the art. 
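The weight-based ranges stated above can be converted into per-dose amounts with
simple arithmetic. The following is a minimal sketch only, not a dosing tool: the
function and constant names are hypothetical, the 70 kg body weight is an
illustrative assumption, and the actual dose remains the prescribing physician's
determination.

```python
UG_PER_MG = 1000.0  # micrograms per milligram

# Per-dose ranges taken from the text, expressed in mg per kg of body weight.
BROAD_RANGE_MG_PER_KG = (0.01 / UG_PER_MG, 100.0)    # about 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1 / UG_PER_MG, 10.0)  # about 0.1 µg/kg to 10 mg/kg


def dose_bounds_mg(weight_kg, range_mg_per_kg):
    """Return (lowest, highest) per-dose amounts in mg for one body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = range_mg_per_kg
    return weight_kg * low, weight_kg * high


if __name__ == "__main__":
    # 70 kg is a hypothetical body weight used only for illustration.
    low, high = dose_bounds_mg(70.0, PREFERRED_RANGE_MG_PER_KG)
    print(f"preferred range for 70 kg: {low:.3f} mg to {high:.0f} mg per dose")
```

For the assumed 70 kg individual, the preferred range corresponds to roughly
0.007 mg (7 µg) to 700 mg per dose, which shows how wide the stated range is.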

3.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source 
derived, including without limitation from recombinant and non-recombinant sources and 
including antibodies and other binding partners of the polypeptides of the invention) may
be administered to a patient in need, by itself, or in pharmaceutical compositions where it
is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other
active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and
other materials well known in the art. The term "pharmaceutically acceptable" means a
non-toxic material that does not interfere with the effectiveness of the biological activity
of the active ingredient(s). The characteristics of the carrier will depend on the route of
administration. The pharmaceutical composition of the invention may also contain
cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF,
IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor,
and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These
agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
insulin-like growth factor (IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either 
enhance the activity of the protein or other active ingredient or complement its activity or 
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other
active ingredient of the present invention may be included in formulations of the
particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic
or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the
clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2,
anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present
invention may be active in multimers (e.g., heterodimers or homodimers) or complexes
with itself or other proteins. As a result, pharmaceutical compositions of the invention
may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the
invention including a first protein, a second protein or a therapeutic agent may be
concurrently administered with the first protein (e.g., at the same time, or at differing
times provided that therapeutic concentrations of the combination of agents are achieved at
the treatment site). Techniques for formulation and administration of the compounds of
the instant application may be found in "Remington's Pharmaceutical Sciences," Mack
Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers
to that amount of the compound sufficient to result in amelioration of symptoms, e.g.,
treatment, healing, prevention or amelioration of the relevant medical condition, or an
increase in rate of treatment, healing, prevention or amelioration of such conditions.
When applied to an individual active ingredient, administered alone, a therapeutically
effective dose refers to that ingredient alone. When applied to a combination, a 
therapeutically effective dose refers to combined amounts of the active ingredients that 
result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the present
invention is administered to a mammal having a condition to be treated. Protein or other
active ingredient of the present invention may be administered in accordance with the
method of the invention either alone or in combination with other therapies such as
treatments employing cytokines, lymphokines or other hematopoietic factors. When
co-administered with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other active ingredient of the present invention may be administered either
simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s),
thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician will decide on the appropriate sequence of administering protein or
other active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

3.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active ingredient of the present invention used in the pharmaceutical
composition or to practice the method of the present invention can be carried out in a
variety of conventional ways, such as oral ingestion, inhalation, topical application or 
cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous 
administration to the patient is preferred. 

Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or into
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds may be administered topically, for example, as eye drops. Furthermore, one
may administer the drug in a targeted drug delivery system, for example, in a liposome
coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The
liposomes will be targeted to and taken up selectively by the afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an 
effective dosage to the desired site of action. The determination of a suitable route of 
administration and an effective dosage for a particular indication is within the level of
skill in the art. Preferably for wound treatment, one administers the therapeutic
compound directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be extrapolated from these dosages or from similar studies in appropriate 
animal models. Dosages can then be adjusted as necessary by the clinician to provide 
maximal therapeutic benefit. 

3.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention
thus may be formulated in a conventional manner using one or more physiologically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of
the active compounds into preparations which can be used pharmaceutically. These
pharmaceutical compositions may be manufactured in a manner that is itself known, e.g.,
by means of conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon the route of administration chosen. When a therapeutically effective
amount of protein or other active ingredient of the present invention is administered
orally, protein or other active ingredient of the present invention will be in the form of a
tablet, capsule, powder, solution or elixir. When administered in tablet form, the
pharmaceutical composition of the invention may additionally contain a solid carrier such
as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95%
protein or other active ingredient of the present invention, and preferably from about 25
to 90% protein or other active ingredient of the present invention. When administered in
liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such
as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The
liquid form of the pharmaceutical composition may further contain physiological saline
solution, dextrose or other saccharide solution, or glycols such as ethylene glycol,
propylene glycol or polyethylene glycol. When administered in liquid form, the
pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other
active ingredient of the present invention, and preferably from about 1 to 50% protein or 
other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of
the present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or other active ingredient of the present invention will be in the form of
a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such
parenterally acceptable protein or other active ingredient solutions, having due regard to
pH, isotonicity, stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should
contain, in addition to protein or other active ingredient of the present invention, an
isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose
Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other
vehicle as known in the art. The pharmaceutical composition of the present invention
may also contain stabilizers, preservatives, buffers, antioxidants, or other additives
known to those of skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks's solution, Ringer's solution, or physiological saline buffer. For transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the 
formulation. Such penetrants are generally known in the art. 

For oral administration, the compounds can be formulated readily by combining
the active compounds with pharmaceutically acceptable carriers well known in the art.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be
obtained from a solid excipient, optionally grinding a resulting mixture, and processing
the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or
dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may
be added to the tablets or dragee coatings for identification or to characterize different 
combinations of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with filler such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for
oral administration should be in dosages suitable for such administration. For buccal
administration, the compositions may take the form of tablets or lozenges formulated in 
conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch. The
compounds may be formulated for parenteral administration by injection, e.g., by bolus
injection or continuous infusion. Formulations for injection may be presented in unit
dosage form, e.g., in ampules or in multi-dose containers, with an added preservative.
The compositions may take such forms as suspensions, solutions or emulsions in oily or
aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing
and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous 
solutions of the active compounds in water-soluble form. Additionally, suspensions of 
the active compounds may be prepared as appropriate oily injection suspensions.
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic
fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection
suspensions may contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may
also contain suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as 
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. In addition to the formulations described previously, the
compounds may also be formulated as a depot preparation. Such long acting
formulations may be administered by implantation (for example subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for example, the compounds may
be formulated with suitable polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives,
for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and an aqueous phase. The co-solvent system may be the VPD
co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar
surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in
absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1
with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic
compounds well, and itself produces low toxicity upon systemic administration.
Naturally, the proportions of a co-solvent system may be varied considerably without
destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may be used instead of polysorbate 80; the fraction size of polyethylene
glycol may be varied; other biocompatible polymers may replace polyethylene glycol,
e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by
those skilled in the art. Sustained-release capsules may, depending on their chemical
nature, release the compounds for a few weeks up to over 100 days. Depending on the
chemical nature and the biological stability of the therapeutic reagent, additional
strategies for protein or other active ingredient stabilization may be employed.
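
The 1:1 dilution of VPD with 5% dextrose in water described above fixes the
nominal composition of VPD:5W. The sketch below works through that arithmetic.
It assumes that "1:1" means equal volumes, that volumes are additive on mixing
(only approximately true for ethanol and water), and that the 5% dextrose
solution is 5% w/v. The dictionary and function names are hypothetical.

```python
# Solute concentrations in % w/v, as stated in the text.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W = {"dextrose": 5.0}  # assumed to be 5% w/v dextrose in water


def mix_equal_volumes(solution_a, solution_b):
    """Return % w/v of every solute after mixing equal volumes of two solutions.

    Assumes the final volume is the sum of the two starting volumes, so each
    solute's concentration is the mean of its concentrations in A and B.
    """
    solutes = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) + solution_b.get(name, 0.0)) / 2.0
        for name in solutes
    }


if __name__ == "__main__":
    for solute, pct in mix_equal_volumes(VPD, D5W).items():
        print(f"{solute}: {pct:g}% w/v")
```

Under these assumptions VPD:5W contains about 1.5% benzyl alcohol, 4%
polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose (all w/v), with
the balance being ethanol and water.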

The pharmaceutical compositions also may comprise suitable solid or gel phase 
carriers or excipients. Examples of such carriers or excipients include but are not limited 
to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, 
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanol amine and the like.

The pharmaceutical composition of the invention may be in the form of a 
complex of the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory
signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their
surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T
cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and
structurally related proteins including those encoded by class I and class II MHC genes
on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen
components could also be supplied as purified MHC-peptide complexes alone or with
co-stimulatory molecules that can directly signal T cells. Alternatively antibodies able to
bind surface immunoglobulin and other molecules on B cells as well as antibodies able to 
bind the TCR and other molecules on T cells can be combined with the pharmaceutical 
composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a
liposome in which protein of the present invention is combined, in addition to other
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist
in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers
in aqueous solution. Suitable lipids for liposomal formulation include, without limitation,
monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like. Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of which are incorporated herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and 
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of
protein or other active ingredient of the present invention with which to treat each
individual patient. Initially, the attending physician will administer low doses of protein
or other active ingredient of the present invention and observe the patient's response.
Larger doses of protein or other active ingredient of the present invention may be
administered until the optimal therapeutic effect is obtained for the patient, and at that
point the dosage is not increased further. It is contemplated that the various
pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 ng to about 100 mg (preferably about 0.1 µg to about 10 mg, more
preferably about 0.1 µg to about 1 mg) of protein or other active ingredient of the present
invention per kg body weight. For compositions of the present invention which are
useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method
includes administering the composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for use in this invention is, of
course, in a pyrogen-free, physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for delivery to the site of bone,
cartilage or tissue damage. Topical administration may be suitable for wound healing
and tissue repair. Therapeutically useful agents other than a protein or other active
ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone
and/or cartilage formation, the composition would include a matrix capable of delivering
the protein-containing or other active ingredient-containing composition to the site of
bone and/or cartilage damage, providing a structure for the developing bone and cartilage 
and optimally capable of being resorbed into the body. Such matrices may be formed of 
materials presently in use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, 
mechanical properties, cosmetic appearance and interface properties. The particular 
application of the compositions will define the appropriate formulation. Potential
matrices for the compositions may be biodegradable and chemically defined calcium 
sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and 
polyanhydrides. Other potential materials are biodegradable and biologically 
well-defined, such as bone or dermal collagen. Further matrices are comprised of pure
proteins or extracellular matrix components. Other potential matrices are 
nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the 
above mentioned types of material, such as polylactic acid and hydroxyapatite or 
collagen and tricalcium phosphate. The bioceramics may be altered in composition, such 
as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle 
shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of
lactic acid and glycolic acid in the form of porous particles having diameters ranging 
from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering
agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein 
compositions from disassociating from the matrix. 

A preferred family of sequestering agents is cellulosic materials such as 
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose, 
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose, 
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being 
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents 
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide, 
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful 
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which 
represents the amount necessary to prevent desorption of the protein from the polymer
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein
the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with
other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question. These agents include various growth factors such as epidermal growth
factor (EGF), platelet derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β), and insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary
applications. Particularly domestic animals and thoroughbred horses, in addition to
humans, are desired patients for such treatment with proteins or other active ingredients
of the present invention. The dosage regimen of a protein-containing pharmaceutical
composition to be used in tissue regeneration will be determined by the attending
physician considering various factors which modify the action of the proteins, e.g., the
amount of tissue weight desired to be formed, the site of damage, the condition of the
damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's
age, sex, and diet, the severity of any infection, time of administration and other clinical
factors. The dosage may vary with the type of matrix used in the reconstitution and with
inclusion of other proteins in the pharmaceutical composition. For example, the addition
of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic
assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric
determinations and tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in the presence of proteins of the present invention in order to proliferate
or to produce a desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for therapeutic purposes.

3.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. More specifically, a therapeutically effective amount
means an amount effective to prevent development of or to alleviate the existing
symptoms of the subject being treated. Determination of the effective amount is well
within the capability of those skilled in the art, especially in light of the detailed
disclosure provided herein. For any compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from appropriate in vitro assays.
For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e.,
the concentration of the test compound which achieves a half-maximal inhibition of the
protein's biological activity). Such information can be used to more accurately determine 
useful doses in humans. 
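
As an illustration of how an IC50 might be read off cell culture data, the
following sketch interpolates on a log-concentration scale between the two
measurements that bracket 50% inhibition. This is one simple approach assumed
here for illustration; the text does not prescribe an estimation method, and
the function name is hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibition_pct):
    """Estimate the concentration giving 50% inhibition.

    Interpolates linearly in log10(concentration) between the two adjacent
    measurements that bracket 50% inhibition. Returns None when the data
    never bracket 50%. Concentrations must be positive and share one unit.
    """
    if len(concentrations) != len(inhibition_pct):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, inhibition_pct))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_lo != i_hi:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10.0 ** log_c
    return None
```

In practice a fitted dose-response curve is usually preferable to two-point
interpolation; the sketch only makes the definition of half-maximal inhibition
concrete.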

A therapeutically effective dose refers to that amount of the compound that results
in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and
therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred. The data obtained from
these cell culture assays and animal studies can be used in formulating a range of dosage
for use in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route of
administration utilized. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount
and interval may be adjusted individually to provide plasma levels of the active moiety
which are sufficient to maintain the desired effects, or minimal effective concentration
(MEC). The MEC will vary for each compound but can be estimated from in vitro data.
Dosages necessary to achieve the MEC will depend on individual characteristics and
route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.
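
The therapeutic index defined above is a simple ratio, and a short sketch may make the arithmetic concrete. The function name and the numerical values below are hypothetical and serve only to illustrate the LD50/ED50 ratio.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # 20.0
```

A larger ratio indicates a wider margin between the dose that is effective and the dose that is lethal in 50% of the population.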

Dosage intervals can also be determined using the MEC value. Compounds should
be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%.
In cases of local administration or selective uptake, the effective local concentration of
the drug may not be related to plasma concentration.
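
As an illustration of how the fraction of time above the MEC might be estimated, the sketch below assumes a single dose followed by first-order (mono-exponential) elimination from an initial plasma concentration. This kinetic model, the function name, and the example numbers are assumptions introduced here for illustration; the disclosure does not prescribe a particular pharmacokinetic model.

```python
import math


def fraction_of_interval_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of one dosing interval during which plasma level exceeds the MEC.

    Assumed model: C(t) = c0 * exp(-k * t), with k = ln(2) / half-life,
    and no carry-over from earlier doses.
    """
    if c0 <= mec:
        return 0.0
    k = math.log(2) / half_life_h
    # Time at which the concentration has decayed to the MEC.
    t_above = math.log(c0 / mec) / k
    return min(1.0, t_above / interval_h)


# Hypothetical values: peak 8 ug/mL, MEC 2 ug/mL, 6 h half-life, dosing every 24 h.
print(fraction_of_interval_above_mec(c0=8.0, mec=2.0, half_life_h=6.0, interval_h=24.0))
```

Under these assumed values the level stays above the MEC for two half-lives, that is, about half of the 24-hour interval, which falls within the 50-90% range stated above.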

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily,
with the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily,
varying in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at longer or shorter intervals.
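
Because the lower bound of the range is stated in micrograms per kilogram and the upper bound in milligrams per kilogram, a brief worked conversion may be helpful. The sketch below uses the bounds given above; the function name and the 70 kg body weight are hypothetical.

```python
def daily_dose_range_mg(weight_kg, low_ug_per_kg=0.01, high_mg_per_kg=100.0):
    """Return (low, high) total daily dose in mg for a given body weight.

    The lower bound is supplied in ug/kg and converted to mg (1 mg = 1000 ug).
    """
    low_mg = low_ug_per_kg * weight_kg / 1000.0
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult, broad range and preferred range.
print(daily_dose_range_mg(70.0))
print(daily_dose_range_mg(70.0, low_ug_per_kg=0.1, high_mg_per_kg=25.0))
```

For a 70 kg subject the broad range therefore spans roughly 0.7 µg to 7 g per day, which illustrates why the actual dose is left to the judgment of the prescribing physician.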

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's age and weight, the severity of the affliction, the
manner of administration and the judgment of the prescribing physician.

3.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device
which may contain one or more unit dosage forms containing the active ingredient. The
pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration. Compositions
comprising a compound of the invention formulated in a compatible pharmaceutical
carrier may also be prepared, placed in an appropriate container, and labeled for
treatment of an indicated condition.

3.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins
of the invention. The term "antibody" as used herein refers to immunoglobulin molecules
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules
that contain an antigen-binding site that specifically binds (immunoreacts with) an
antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In
general, an antibody molecule obtained from humans relates to any of the classes IgG,
IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain
present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and
others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain.
Reference herein to antibodies includes a reference to all such classes, subclasses and
types of human antibody species.

An isolated related protein of the invention may be intended to serve as an
antigen, or a portion or fragment thereof, and additionally can be used as an immunogen
to generate antibodies that immunospecifically bind the antigen, using standard
techniques for polyclonal and monoclonal antibody preparation. The full-length protein
can be used or, alternatively, the invention provides antigenic peptide fragments of the
antigen for use as immunogens. An antigenic peptide fragment comprises at least 6
amino acid residues of the amino acid sequence of the full length protein, such as an
amino acid sequence shown in SEQ ID NO: 527-1052, and encompasses an epitope
thereof such that an antibody raised against the peptide forms a specific immune complex
with the full length protein or with any fragment that contains the epitope. Preferably,
the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located
on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a surface region of the protein, e.g., a hydrophilic region. A
hydrophobicity analysis of the human related protein sequence will indicate which
regions of a related protein are particularly hydrophilic and, therefore, are likely to
encode surface residues useful for targeting antibody production. As a means for
targeting antibody production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity may be generated by any method well known in the art, including, for
example, the Kyte Doolittle or the Hopp Woods methods, either with or without Fourier
transformation. See, e.g., Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-
3828; Kyte and Doolittle, 1982, J. Mol. Biol. 157: 105-142, each of which is incorporated
herein by reference in its entirety. Antibodies that are specific for one or more domains
within an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are
also provided herein.
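
The hydropathy analysis referred to above can be sketched as a sliding-window average over per-residue hydropathy values. The example below uses the Kyte-Doolittle scale, in which more negative values are more hydrophilic. The function names, the window length of 7, and the example sequence are assumptions chosen for illustration; no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Return the mean hydropathy of every window of the given length."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq.upper()[start:start + window], profile[start]


# Hypothetical sequence, for illustration only.
print(most_hydrophilic_window("MKTLLVAGDEKRRESDLLAAV", window=7))
```

The window with the lowest mean score is a candidate surface region, and a peptide spanning it satisfies the minimum length of 6 residues stated above for an antigenic peptide fragment when the window is 6 or longer.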

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite
sequence identity, homology, or similarity found in the family of polypeptides), but may
also interact with other proteins (for example, S. aureus protein A or other antibodies in
ELISA techniques) through interactions with sequences outside the variable region of the
antibodies, and in particular, in the constant region of the molecule. Screening assays to
determine binding specificity of an antibody of the invention are well known and
routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow
et al. (Eds), Antibodies: A Laboratory Manual; Cold Spring Harbor Laboratory; Cold
Spring Harbor, NY (1988), Chapter 6. Antibodies that recognize and bind fragments of
the polypeptides of the invention are also contemplated, provided that the antibodies are
first and foremost specific for, as defined above, full-length polypeptides of the
invention. As with antibodies that are specific for full length polypeptides of the
invention, antibodies of the invention that recognize fragments are those which can
distinguish polypeptides from the same family of polypeptides despite inherent sequence
identity, homology, or similarity found in the family of proteins.

Antibodies of the invention are useful for, for example, therapeutic purposes (by
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes
described herein are also comprehended. In general, a kit of the invention also includes a
control antigen for which the antibody is immunospecific. The invention further provides
a hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.

Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions
associated with the protein and also in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or
leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and preventing the metastatic spread of the cancerous cells, which may be
mediated by the protein.

The labeled antibodies of the present invention can be used for in vitro, in vivo,
and in situ assays to identify cells or tissues in which a fragment of the polypeptide of
interest is expressed. The antibodies may also be used directly in therapies or other
diagnostics. The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics such as
polycarbonate, complex carbohydrates such as agarose and Sepharose®, and acrylic resins
such as polyacrylamide and latex beads. Techniques for coupling antibodies to such
solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10
(1986); Jacoby, W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The
immobilized antibodies of the present invention can be used for in vitro, in vivo, and in
situ assays as well as for immuno-affinity purification of the proteins of the present
invention.

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

3.13.1 POLYCLONAL ANTIBODIES 
For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the
immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore,
the protein may be conjugated to a second protein known to be immunogenic in the
mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and
soybean trypsin inhibitor. The preparation can further include an adjuvant. Various
adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide),
surface-active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin
and Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can
be isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No.
8 (April 17, 2000), pp. 25-28).

3.13.2 MONOCLONAL ANTIBODIES

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", 
as used herein, refers to a population of antibody molecules that contain only one 
molecular species of antibody molecule consisting of a unique light chain gene product 
and a unique heavy chain gene product. In particular, the complementarity determining 
regions (CDRs) of the monoclonal antibody are identical in all the molecules of the 
population. MAbs thus contain an antigen-binding site capable of immunoreacting with a 
particular epitope of the antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form
a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic
Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian
cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or
mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells
lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of
HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable
high level expression of antibody by the selected antibody-producing cells, and are
sensitive to a medium such as HAT medium. More preferred immortalized cell lines are
murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also
have been described for the production of human monoclonal antibodies (Kozbor, J.
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be
assayed for the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma
cells is determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunoabsorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and
Pollard, Anal. Biochem., 107:220 (1980). Preferably, antibodies having a high degree of
specificity and a high binding affinity for the target antigen are isolated.
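
To illustrate the principle behind a Scatchard analysis, the sketch below fits the classical linear relation Bound/Free = (Bmax - Bound)/Kd by ordinary least squares, so that the slope equals -1/Kd and the x-intercept equals Bmax. This is a generic textbook form and is not asserted to be the specific procedure of the cited reference; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit Bound/Free versus Bound by least squares; return (Kd, Bmax).

    bound, free: equal-length lists of bound and free ligand concentrations
    in the same units. Assumes a single class of independent binding sites.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope = -1 / Kd
    bmax = -intercept / slope  # x-intercept of the fitted line
    return kd, bmax


# Synthetic data generated from Kd = 2.0 and Bmax = 10.0, for illustration only.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
print(scatchard_fit(bound, free))
```

A smaller Kd corresponds to a higher binding affinity, which is the property selected for in the preceding paragraph.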

After the desired hybridoma cells are identified, the clones can be subcloned by
limiting dilution procedures and grown by standard methods. Suitable culture media for
this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified
from the culture medium or ascites fluid by conventional immunoglobulin purification
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the invention can be readily isolated and sequenced using conventional
procedures (e.g., by using oligonucleotide probes that are capable of binding specifically
to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells
of the invention serve as a preferred source of such DNA. Once isolated, the DNA can
be placed into expression vectors, which are then transfected into host cells such as
simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not
otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also can be modified, for example, by
substituting the coding sequence for human heavy and light chain constant domains in
place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature
368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all
or part of the coding sequence for a non-immunoglobulin polypeptide. Such a
non-immunoglobulin polypeptide can be substituted for the constant domains of an antibody
of the invention, or can be substituted for the variable domains of one antigen-combining
site of an antibody of the invention to create a chimeric bivalent antibody.

3.13.3 HUMANIZED ANTIBODIES

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally
comprised of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a non-human immunoglobulin. Humanization can be performed following
the method of Winter and co-workers (Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536
(1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues that are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general,
the humanized antibody will comprise substantially all of at least one, and typically two,
variable domains, in which all or substantially all of the CDR regions correspond to those
of a non-human immunoglobulin and all or substantially all of the framework regions are
those of a human immunoglobulin consensus sequence. The humanized antibody
optimally also will comprise at least a portion of an immunoglobulin constant region
(Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al.,
1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).

WO 02/074961    PCT/US02/05109

3.13.4 HUMAN ANTIBODIES 

Fully human antibodies relate to antibody molecules in which essentially the
entire sequences of both the light chain and the heavy chain, including the CDRs, arise
from human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein. Human monoclonal antibodies can be prepared by the trioma
technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol
Today 4: 72) and the EBV hybridoma technique to produce human monoclonal
antibodies (see Cole, et al., 1985 In: Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the
practice of the present invention and may be produced by using human hybridomas (see
Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human
B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely
inactivated. Upon challenge, human antibody production is observed, which closely
resembles that seen in humans in all respects, including gene rearrangement, assembly,
and antibody repertoire. This approach is described, for example, in U.S. Patent Nos.
5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994));
Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51
(1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar
(Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman
animals that are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO94/02602). The endogenous genes encoding the heavy and light
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci
encoding human heavy and light chain immunoglobulins are inserted into the host's
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells that secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest,
as, for example, a preparation of a polyclonal antibody, or alternatively from
immortalized B cells derived from the animal, such as hybridomas producing monoclonal
antibodies. Additionally, the genes encoding the immunoglobulins with human variable
regions can be recovered and expressed to obtain the antibodies directly, or can be further
modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment
genes from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent rearrangement of the locus and to prevent formation of a transcript of a
rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector containing a gene encoding a selectable marker; and producing from the
embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene
encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a
light chain into another mammalian host cell, and fusing the two cells to form a hybrid
cell. The hybrid cell expresses an antibody containing the heavy chain and the light
chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody
that binds immunospecifically to the relevant epitope with high affinity, are disclosed in
PCT publication WO 99/53049.

3.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid
and effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent; and (iv) Fv fragments.

3.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies
that have binding specificities for at least two different antigens. In the present case, one
of the binding specificities is for an antigenic protein of the invention. The second
binding target is any other antigen, and advantageously is a cell-surface protein or
receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the
random assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a potential mixture of ten different antibody molecules, of which
only one has the correct bispecific structure. The purification of the correct molecule is
usually accomplished by affinity chromatography steps. Similar procedures are disclosed
in WO 93/08829, published 13 May 1993, and in Traunecker et al., 1991, EMBO J.,
10:3655-3659.
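
The figure of ten possible antibody molecules follows from simple combinatorics, which the sketch below makes explicit. It assumes two heavy chains (H1, H2) and two light chains (L1, L2) that pair at random, with H1/L1 and H2/L2 taken as the cognate pairs; the chain labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement, product

HEAVY = ["H1", "H2"]
LIGHT = ["L1", "L2"]


def quadroma_species():
    """Enumerate distinct antibodies formed by random chain assortment.

    Each antibody consists of two heavy/light half-molecules; the order of
    the two halves does not matter.
    """
    half_molecules = list(product(HEAVY, LIGHT))  # 4 possible half-molecules
    return list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(species):
    """True only when each heavy chain is paired with its cognate light chain."""
    return set(species) == {("H1", "L1"), ("H2", "L2")}


species = quadroma_species()
print(len(species))                                    # 10
print(sum(is_correct_bispecific(s) for s in species))  # 1
```

Four possible half-molecules taken two at a time without regard to order give ten combinations, only one of which carries both cognate pairings, which is why purification of the correct molecule is required.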

Antibody variable domains with the desired binding specificities (antibody-
antigen combining sites) can be fused to immunoglobulin constant domain sequences.
The fusion preferably is with an immunoglobulin heavy-chain constant domain,
comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the
first heavy-chain constant region (CH1) containing the site necessary for light-chain
binding present in at least one of the fusions. DNAs encoding the immunoglobulin
heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into
separate expression vectors, and are co-transfected into a suitable host organism. For
further details of generating bispecific antibodies see, for example, Suresh et al., Methods
in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between
a pair of antibody molecules can be engineered to maximize the percentage of
heterodimers that are recovered from recombinant cell culture. The preferred interface
comprises at least a part of the CH3 region of an antibody constant domain. In this
method, one or more small amino acid side chains from the interface of the first antibody
molecule are replaced with larger side chains (e.g. tyrosine or tryptophan).
Compensatory "cavities" of identical or similar size to the large side chain(s) are created
on the interface of the second antibody molecule by replacing large amino acid side
chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as
homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved
to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to
thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The
bispecific antibodies produced can be used as agents for the selective immobilization of
enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and
chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med.
175:217-225 (1992) describe the production of a fully humanized bispecific antibody
F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected
to directed chemical coupling in vitro to form the bispecific antibody. The bispecific
antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and
normal human T cells, as well as trigger the lytic activity of human cytotoxic
lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments
directly from recombinant cell culture have also been described. For example, bispecific
antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5):1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins
were linked to the Fab' portions of two different antibodies by gene fusion. The antibody
homodimers were reduced at the hinge region to form monomers and then re-oxidized to
form the antibody heterodimers. This method can also be utilized for the production of
antibody homodimers. The "diabody" technology described by Hollinger et al., Proc.
Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative mechanism for
making bispecific antibody fragments. The fragments comprise a heavy-chain variable
domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH
and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another
strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv)
dimers has also been reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example,
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to
cells which express a particular antigen. These antibodies possess an antigen-binding
arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as
EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest binds the
protein antigen described herein and further binds tissue factor (TF).

3.13.7 HETEROCONJUGATE ANTIBODIES 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360;
WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in
vitro using known methods in synthetic protein chemistry, including those involving
crosslinking agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction or by forming a thioether bond. Examples of suitable reagents for this
purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed,
for example, in U.S. Patent No. 4,676,980.

3.13.8 EFFECTOR FUNCTION ENGINEERING

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See Caron
et al., J. Exp. Med., 176:1191-1195 (1992) and Shopes, J. Immunol., 148:2918-2922
(1992). Homodimeric antibodies with enhanced anti-tumor activity can also be prepared
using heterobifunctional cross-linkers as described in Wolff et al., Cancer Research,
53:2560-2565 (1993). Alternatively, an antibody can be engineered that has dual Fc
regions and can thereby have enhanced complement lysis and ADCC capabilities. See
Stevenson et al., Anti-Cancer Drug Design, 3:219-230 (1989).

3.13.9 IMMUNOCONJUGATES

The invention also pertains to immunoconjugates comprising an antibody
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be
used include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain,
modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins (PAPI, PAPII, and PAP-S), momordica charantia inhibitor, curcin,
crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the trichothecenes. A variety of radionuclides are available for the
production of radioconjugated antibodies. Examples include radioisotopes of Bi, I, In,
and Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis(p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-
ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and bis-active
fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238:1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylenetriaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the
antibody. See WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that
is in turn conjugated to a cytotoxic agent.



3.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such as
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to
create a manufacture comprising computer readable medium having recorded thereon a
nucleotide sequence of the present invention. As used herein, "recorded" refers to a
process for storing information on computer readable medium. A skilled artisan can
readily adopt any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence
information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented
in a word processing text file, formatted in commercially-available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g., text file or database)
in order to obtain computer readable medium having recorded thereon the nucleotide
sequence information of the present invention.
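
As an illustration of the two storage options named above (an ASCII text file and a database), the following minimal sketch records a sequence in both forms. It is an assumption-laden example only: the FASTA layout, the function names, the table layout, and the use of Python's built-in SQLite module as a stand-in for a database application such as DB2, Sybase, or Oracle are all choices made for this illustration and are not part of the disclosure.

```python
import sqlite3
import textwrap


def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one nucleotide sequence as an ASCII FASTA record."""
    # Wrap the residues to a fixed line width, as is conventional for FASTA.
    body = "\n".join(textwrap.wrap(sequence.upper(), width))
    return f">{seq_id}\n{body}\n"


def store_in_database(conn: sqlite3.Connection, seq_id: str, sequence: str) -> None:
    """Record a sequence in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)",
        (seq_id, sequence.upper()),
    )
    conn.commit()


def fetch_from_database(conn: sqlite3.Connection, seq_id: str):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

Either representation satisfies the definition of "recorded" given above; which one is preferable depends on how the stored information will later be accessed.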

By providing any of the nucleotide sequences SEQ ID NOs: 1 - 526 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NOs: 1 - 526 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may
be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
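
To make the notion of identifying ORFs within a stored sequence concrete, the sketch below scans all six reading frames for ATG-to-stop stretches. It is not the BLAST or BLAZE procedure referred to above; it is a deliberately simple stand-in, and the function names, the minimum-length parameter, and the use of the standard genetic code's stop codons are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Find ATG...stop ORFs in all six frames.

    Returns tuples (strand, start, end, orf). Coordinates are 0-based
    on the scanned strand, and end is exclusive (stop codon included).
    """
    results = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume searching after the stop codon
    return results
```

For example, `find_orfs("CCATGAAATTTTAGGG")` reports a single forward-strand ORF, `ATGAAATTTTAG`, beginning at position 2.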

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing unit
(CPU), input means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is suitable for
use in the present invention. As stated above, the computer-based systems of the present
invention comprise a data storage means having stored therein a nucleotide sequence of
the present invention and the necessary hardware means and software means for
supporting and implementing a search means. As used herein, "data storage means"
refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of a known sequence which
match a particular target sequence or target motif. A variety of known algorithms are
disclosed publicly, and a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the present invention.
Examples of such software include, but are not limited to, Smith-Waterman, MacPattern
(EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily
recognize that any one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present computer-based
systems. As used herein, a "target sequence" can be any nucleic acid or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most preferred sequence length
of a target sequence is from about 10 to 300 amino acids, more preferably from about 30
to 100 nucleotide residues. However, it is well recognized that searches for
commercially important fragments, such as sequence fragments involved in gene
expression and protein processing, may be of shorter length.
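
The statement that longer target sequences are less likely to occur at random can be made quantitative under a simplifying assumption. If residues in the database are treated as independent and equally frequent (an idealization that real sequences do not satisfy), a target of length n over an alphabet of size a is expected to match by chance at about (L - n + 1) x a^(-n) positions in a database of L residues. The function name and parameters below are hypothetical and serve only to illustrate this estimate.

```python
def expected_random_hits(target_len: int, db_len: int, alphabet_size: int) -> float:
    """Expected number of chance exact matches of a target in a random database.

    Assumes independent, uniformly distributed residues (an idealization).
    alphabet_size is 4 for nucleotides and 20 for amino acids.
    """
    if target_len < 1 or alphabet_size < 2:
        raise ValueError("target_len must be >= 1 and alphabet_size >= 2")
    if target_len > db_len:
        return 0.0
    positions = db_len - target_len + 1  # number of possible alignment starts
    return positions * alphabet_size ** (-target_len)
```

Under this model a six-nucleotide target is expected to occur by chance roughly 244 times in a database of one million nucleotides, whereas a thirty-nucleotide target is expected essentially never, which is consistent with the preference for longer targets stated above.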

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding of
the target motif. There are a variety of target motifs known in the art. Protein target
motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic
acid target motifs include, but are not limited to, promoter sequences, hairpin structures
and inducible expression elements (protein binding sequences).
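
As one example of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a short stem whose reverse complement recurs downstream after a loop of bounded length. The stem length, loop bounds, and function names are illustrative assumptions; a practical search would also score mismatches and thermodynamic stability.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 4, min_loop: int = 3, max_loop: int = 8):
    """Report perfect inverted repeats that could form a stem-loop.

    Returns tuples (stem_start, loop_length, left_arm, right_arm),
    with stem_start 0-based.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)  # sequence the right arm must have
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break  # right arm would run past the end of the sequence
            if seq[j:j + stem] == target:
                hits.append((i, loop, left, seq[j:j + stem]))
    return hits
```

For instance, in `GCATTTTTATGC` the arms `GCAT` and `ATGC` are reverse complements separated by a four-base loop, so the function reports one candidate starting at position 0.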

3.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be
used to control gene expression through triple helix formation or antisense DNA or RNA,
both of which methods are based on the binding of a polynucleotide sequence to DNA or
RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in
length and are designed to be complementary to a region of the gene involved in
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA
itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple
helix formation optimally results in a shut-off of RNA transcription from DNA, while
antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
Both techniques have been demonstrated to be effective in model systems. Information
contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
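
The design step for an antisense oligonucleotide can be sketched as taking a 20 to 40 base window of the mRNA and computing its reverse complement. The example below is a minimal illustration only: the function name and arguments are hypothetical, the oligonucleotide is returned as DNA, and no account is taken of target accessibility, secondary structure, or specificity, all of which matter in practice.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide complementary to an mRNA window.

    start is the 0-based position of the window on the mRNA; length must
    fall in the 20 to 40 base range given in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    # Work in the DNA alphabet so the complement table applies directly.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # The antisense strand is the reverse complement, written 5' to 3'.
    return target.translate(_COMPLEMENT)[::-1]
```

Applied to the first 20 bases of the made-up mRNA `AUGGCCAAGCUUGGAUCCGAAUUC`, the function returns `TCGGATCCAAGCTTGGCCAT`.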

3.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or 
expression of one of the ORFs of the present invention, or homolog thereof, in a test 
sample, using a nucleic acid probe or antibodies of the present invention, optionally 
conjugated or otherwise associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention
under such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying 
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample
vary. Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the nucleic acid probe or antibody used in
the assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted to
employ the nucleic acid probes or antibodies of the present invention. Examples of such
assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock,
G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1
(1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays:
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985). The test samples of the present
invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described
method will vary based on the assay format, nature of the detection method and the
tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain
the necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers which comprises: (a) a first container comprising one of the probes or
antibodies of the present invention; and (b) one or more other containers comprising one
or more of the following: wash reagents, reagents capable of detecting presence of a
bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers or
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not
cross-contaminated, and the agents or solutions of each container can be added in a
quantitative fashion from one compartment to another. Such containers will include a
container which will accept the test sample, a container which contains the antibodies
used in the assay, containers which contain wash reagents (such as phosphate buffered
saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the
bound antibody or probe. Types of detection reagents include labeled nucleic acid probes,
labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the
enzymatic, or antibody binding reagents which are capable of reacting with the labeled
antibody. One skilled in the art will readily recognize that the disclosed probes and
antibodies of the present invention can be readily incorporated into one of the established
kit formats which are well known in the art.

3.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in
medical imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the invention is involved in the immune response, for imaging sites of
inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such
methods involve chemical attachment of a labeling or imaging agent, administration of
the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging
the labeled polypeptide in vivo at the target site.

3.18 SCREENING ASSAYS

Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set
forth in SEQ ID NOs: 1 - 526, or bind to a specific domain of the polypeptide encoded by
the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide of the invention for a time sufficient to form a polynucleotide/compound
complex, and detecting the complex, so that if a polynucleotide/compound complex is
detected, a compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind
to a polypeptide of the invention can comprise contacting a compound with a polypeptide
of the invention for a time sufficient to form a polypeptide/compound complex, and
detecting the complex, so that if a polypeptide/compound complex is detected, a
compound that binds to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention
can also comprise contacting a compound with a polypeptide of the invention in a cell for
a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene sequence expression, so that if a polypeptide/compound complex
is detected, a compound that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate
the activity of a polypeptide of the invention (that is, increase or decrease its activity,
relative to activity observed in the absence of the compound). Alternatively, compounds
identified via such methods can include compounds which modulate the expression of a
polynucleotide of the invention (that is, increase or decrease expression relative to
expression levels observed in the absence of the compound). Compounds, such as
compounds identified via the methods of the invention, can be tested using standard
assays well known to those of skill in the art for their ability to modulate
activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be
selected and screened at random or rationally selected or designed using protein modeling
techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents and the like are selected at random and are assayed for their ability to bind to the
protein encoded by the ORF of the present invention. Alternatively, agents may be
rationally selected or designed. As used herein, an agent is said to be "rationally selected
or designed" when the agent is chosen based on the configuration of the particular
protein. For example, one skilled in the art can readily adapt currently available
procedures to generate peptides, pharmaceutical agents and the like, capable of binding to
a specific peptide sequence, in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In
Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one of the
ORFs or EMFs of the present invention. As described above, such agents can be
randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a
skilled artisan to design sequence specific or element specific agents, modulating the
expression of either a single ORF or multiple ORFs which rely on the same EMF for
expression control. One class of DNA binding agents are agents which contain base
residues which hybridize or form a triple helix formation by binding to DNA or RNA.
Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or
can be a variety of sulfhydryl or polymeric derivatives which have base attachment
capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences
of the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present
invention can be used as a diagnostic agent. Agents which bind to a protein encoded by
one of the ORFs of the present invention can be formulated using known techniques to
generate a pharmaceutical composition.

3.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid hybridization probes capable of hybridizing with naturally occurring
nucleotide sequences. The hybridization probes of the subject invention may be derived
from any of the nucleotide sequences SEQ ID NOs: 1 - 526. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NOs: 1 - 526 can be used as an indicator of the
presence of RNA of cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in
situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection
of identical sequences or a degenerate pool of possible sequences for identification of
closely related genomic sequences.
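
A "degenerate pool of possible sequences" can be enumerated from a probe written with ambiguity codes. The sketch below assumes the standard IUPAC nucleotide ambiguity codes; the function name and the example probes are hypothetical and are not taken from the sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    # The Cartesian product over per-position choices yields the full pool.
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the per-position degeneracies, so a probe such as `AARYTN` represents 1 x 1 x 2 x 2 x 1 x 4 = 16 discrete sequences. Pool size grows quickly with the number of ambiguous positions, which is one reason degenerate probes are usually kept short or limited in degeneracy.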

Other means for producing specific hybridization probes for nucleic acids include
the cloning of nucleic acid sequences into vectors for the production of mRNA probes.
Such vectors are known in the art and are commercially available and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerase such as T7 or SP6 RNA polymerase and the appropriate radioactively labeled
nucleotides. The nucleotide sequences may be used to construct hybridization probes for
mapping their respective genomic sequences. The nucleotide sequence provided herein
may be mapped to a chromosome or specific regions of a chromosome using well known
genetic and/or chromosomal mapping techniques. These techniques include in situ
hybridization, linkage analysis against known chromosomal markers, hybridization
screening with libraries or flow-sorted chromosomal preparations specific to known
chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988)
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal map and a specific disease (or predisposition to a specific disease) may
help delimit the region of DNA associated with that genetic disease. The nucleotide
sequences of the subject invention may be used to detect differences in gene sequences
between normal, carrier or affected individuals.

3.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly
practiced using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to
those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One
strategy is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6) 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987;
Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base
modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated
herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8)
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be
used. Nunc Laboratories have developed a method by which DNA can be covalently bound
to the microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface
grafted with secondary amino groups (>NH) that serve as bridge-heads for further covalent
coupling. CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules
may be bound to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem.
198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond
is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond
joins the DNA to the CovaLink NH secondary amino groups that are positioned at the end
of spacer arms covalently grafted onto the polystyrene surface through a 2 nm long spacer
arm. To link an oligonucleotide to CovaLink NH via a phosphoramidate bond, the
oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible
for biotin to be covalently bound to CovaLink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl)
and denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM
1-MeIm7. A ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well)
standing on ice.
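
The dilution step above (adding a 0.1 M stock to reach a final concentration of 10 mM) can be worked through with a short calculation. The following is a minimal sketch only; the function name and the 900 µl sample volume are illustrative assumptions and do not appear in the cited protocol.

```python
def stock_volume_to_add(sample_volume_ul, stock_molar, final_molar):
    """Volume of stock (in ul) to add so the mixture reaches final_molar.

    Solves stock_molar * v = final_molar * (sample_volume_ul + v) for v.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_volume_ul * final_molar / (stock_molar - final_molar)


# Hypothetical example: 900 ul of denatured DNA solution (assumed volume),
# 0.1 M 1-methylimidazole stock, 10 mM target concentration.
volume_ul = stock_volume_to_add(900.0, 0.1, 0.010)
print(round(volume_ul, 1))
```

Under these assumed numbers, one part stock is added to nine parts sample, which also dilutes the DNA by a factor of 0.9.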

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are
incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g.,
Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing
solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4
N NaOH, 0.25% SDS heated to 50°C).

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support
involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link to aliphatic hydroxyl groups carried by the support. The
oligonucleotide is then synthesized on the supported nucleoside and protecting groups
removed from the synthetic oligonucleotide chain under standard conditions that do not
cleave the oligonucleotide from the support. Suitable reagents include nucleoside
phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection
may be employed in the chemical synthesis of oligonucleotides directly on a glass surface,
as described by Fodor et al. (1991) Science 251(4995), incorporated herein by
reference. Probes may also be immobilized on nylon supports as described by Van Ness et
al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using the method of
Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being specifically
incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective activation of the
5'-amine of oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6
(incorporated herein by reference). These authors used current photolithographic techniques
to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in
which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies.
A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner.
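
As a small illustration of the combinatorial count, 256 equals 4^4, the number of distinct sequences obtained by varying four positions over the four bases. Whether the cited array was composed in exactly this way is an assumption made here for illustration; the helper name `all_kmers` is hypothetical.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every sequence of length k over the given alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=k)]


tetramers = all_kmers(4)
print(len(tetramers))  # 4**4 combinations
print(tetramers[:3])
```

Each enumerated sequence could then be assigned to one spatially defined position of a 16 x 16 grid.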

3.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, 
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC 
inserts, and RNA, including mRNA without any amplification steps. For example, 
Sambrook et al (1989) describes three protocols for the isolation of high molecular weight 
DNA from mammalian cells (p. 9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors 
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification
methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of 
DNA samples may be prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those
of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28
of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990)
Nucleic Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to
the cell. The results of these studies indicate that low-pressure shearing is a useful
alternative to sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using
the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic
Acids Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter
the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992)
quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI**
digest of pUC19 that was size fractionated by a rapid gel filtration method and directly
ligated, without end repair, to a lacZ minus M13 cloning vector. Sequence analysis of 76
clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and
that new sequence data is accumulated at a rate consistent with random fragmentation.
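
The recognition rules described above can be sketched in silico. This is an illustrative model only: it assumes a complete digest of one strand written 5' to 3', with Pu = A or G and Py = C or T, and the function names are hypothetical. Real CviJI** digests are partial, which is what yields the quasi-random size distribution.

```python
import re

# Zero-width lookaheads so that overlapping recognition sites are all found.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")                         # PuGCPy
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # CviJI** sites


def cut_positions(seq, relaxed=False):
    """Return indices i where the cut falls between seq[i-1] (G) and seq[i] (C)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [match.start() + 2 for match in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split seq at every cut position (complete digest, blunt ends)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


print(fragments("AGCTAGCA"))                # normal specificity
print(fragments("AGCTAGCA", relaxed=True))  # relaxed (CviJI**) specificity
```

Note that PyGCPu is deliberately absent from the relaxed pattern, since the text lists only PuGCPy, PyGCPy and PuGCPu as restricted sites.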

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead
of 2-5 µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared, it is important to denature the DNA to give single stranded pieces available for
hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C.
The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments
before they are contacted with the chip. Phosphate groups must also be removed from
genomic DNA by methods known in the art.

3.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of
a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the
density of the wells is achieved. One to 25 dots may be accommodated in 1 mm²,
depending on the type of label used. By avoiding spotting in some preselected number of
rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray
may be the same genomic segment of DNA (or the same gene) from different individuals, or
may be different, overlapped genomic clones. Each of the subarrays may represent replica
spotting of the same samples. In one example, a selected gene segment may be amplified
from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate
(all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By
using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays
may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
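
The 64-patient example can be sketched as a coordinate calculation. The sketch assumes a 9 mm pin pitch (the usual well spacing of 96-well plates, not stated in the text) and an 8 x 8 arrangement of the 64 dots in each subarray; eight 1 mm dots plus the 1 mm gap would then fill one 9 mm pitch. The function name and argument order are hypothetical.

```python
def spot_position(patient, pin_row, pin_col,
                  pin_pitch_mm=9.0, dot_pitch_mm=1.0, dots_per_side=8):
    """Return the (x, y) position in mm of one dot on the membrane.

    patient: 0..63, selects the offset of the print inside each subarray.
    pin_row, pin_col: position of the pin in the 8 x 12 pin device.
    """
    if not 0 <= patient < dots_per_side ** 2:
        raise ValueError("patient index out of range for the subarray")
    if not (0 <= pin_row < 8 and 0 <= pin_col < 12):
        raise ValueError("pin outside the 8 x 12 grid")
    # Offset printing: each plate is printed with the pin head shifted slightly.
    offset_row, offset_col = divmod(patient, dots_per_side)
    x = pin_col * pin_pitch_mm + offset_col * dot_pitch_mm
    y = pin_row * pin_pitch_mm + offset_row * dot_pitch_mm
    return (x, y)


print(spot_position(0, 0, 0))    # first patient, first pin
print(spot_position(63, 7, 11))  # last patient, last pin
```

With these assumed dimensions all 64 x 96 dots occupy distinct positions and stay within an 8 x 12 cm membrane.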

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers e.g. a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by
exposure to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration
of the present disclosure, one of skill in the art will appreciate that many other embodiments
and variations may be made in the scope of the present invention. Accordingly, it is
intended that the broader aspects of the present invention not be limited to the disclosure of
the following examples. The present invention is not to be limited in scope by the
exemplified embodiments which are intended as illustrations of single aspects of the
invention, and compositions and methods which are functionally equivalent are within the
scope of the invention. Indeed, numerous modifications and variations in the practice of the
invention are expected to occur to those skilled in the art upon consideration of the present
preferred embodiments. Consequently, the only limitations which should be placed upon
the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby 
incorporated by reference in their entirety. 

4.0 EXAMPLES 

4.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from
various human tissues and in some cases isolated from a genomic library derived from
human chromosome using standard PCR, SBH sequence signature analysis and Sanger
sequencing techniques. The inserts of the library were amplified with PCR using primers
specific for the vector sequences which flank the inserts. Clones from cDNA libraries were
spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers)
to obtain signature sequences. The clones were clustered into groups of similar or identical
sequences. Representative clones were selected for sequencing.
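
The clustering step can be illustrated with an in silico analogue. In practice the signature comes from hybridization intensities of many short probes; the sketch below instead assumes known sequences, models a signature as the set of 7-mers a clone contains, and groups clones greedily by Jaccard similarity. The threshold of 0.8 and all names are assumptions made for illustration, not the procedure actually used.

```python
def kmer_signature(seq, k=7):
    """Set of all k-mers present in seq (a stand-in for a hybridization signature)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(clones, threshold=0.8, k=7):
    """Greedy clustering: a clone joins the first cluster whose representative
    signature is at least `threshold` similar, otherwise it starts a new one."""
    clusters = []  # list of (representative signature, member names)
    for name, seq in clones.items():
        signature = kmer_signature(seq, k)
        for representative, members in clusters:
            if jaccard(signature, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((signature, [name]))
    return [members for _, members in clusters]


clones = {
    "clone_a": "ACGTACGTAGCTAGCTAGGATCC",
    "clone_b": "ACGTACGTAGCTAGCTAGGATCC",
    "clone_c": "TTTTTTTTTTGGGGGGGGGGCCCC",
}
print(cluster(clones))
```

The first member of each cluster could serve as the representative clone selected for sequencing.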

In some cases, the 5' sequence of the amplified inserts was then deduced using a
typical Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a
377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. In
some cases RACE (Rapid Amplification of cDNA Ends) was performed to further extend
the sequence in the 5' direction.

4.2 EXAMPLE 2 
Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled
from sequences that were obtained from a cDNA library by methods described in Example
1 above, and in some cases sequences obtained from one or more public databases. The
nucleic acids were assembled using an EST sequence as a seed. Then a recursive algorithm
was used to extend the seed EST into an extended assemblage, by pulling additional
sequences from different databases (i.e., Hyseq's database containing EST sequences,
dbEST version 119, gb pri 119, and UniGene version 119) that belong to this assemblage.
The algorithm terminated when there were no additional sequences from the above databases
that would extend the assemblage. Inclusion of component sequences into the assemblage
was based on a BLASTN hit to the extending assemblage with BLAST score greater than
300 and percent identity greater than 95%.
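
The seed-extension loop can be sketched as follows. This is a simplified illustration, not the algorithm actually used: the `Hit` record and the `search` callable are hypothetical stand-ins for a BLASTN search, and the sketch queries with each member sequence in turn rather than with the growing assemblage as a whole. Only the two thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    seq_id: str              # identifier of the database sequence that was hit
    score: float             # BLAST score of the hit
    percent_identity: float  # percent identity of the alignment


def extend_assemblage(seed_id, search, min_score=300.0, min_identity=95.0):
    """Grow a set of sequence ids from a seed until no new hit qualifies.

    `search` is any callable mapping a sequence id to an iterable of Hit
    objects (hypothetical interface standing in for a BLASTN run).
    """
    members = {seed_id}
    frontier = [seed_id]
    while frontier:
        current = frontier.pop()
        for hit in search(current):
            if hit.seq_id in members:
                continue
            if hit.score > min_score and hit.percent_identity > min_identity:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members


# Toy data standing in for database search results.
toy_hits = {
    "seed": [Hit("est1", 350, 98.0), Hit("est2", 200, 99.0)],
    "est1": [Hit("est3", 400, 96.0), Hit("seed", 350, 98.0)],
    "est3": [Hit("est4", 500, 95.0)],
}
print(sorted(extend_assemblage("seed", lambda seq_id: toy_hits.get(seq_id, []))))
```

The loop stops when the frontier is empty, which mirrors the stated termination condition that no additional sequence extends the assemblage.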

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the
sequence was checked using FASTY and/or BLAST against Genbank (i.e., dbEST version
121, gb pri 121, UniGene version 121, Genpept release 121) and the amino acid version of
Geneseq released February 15, 2001. Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide and amino acid
sequences, including splice variants resulting from these procedures, are shown in the
Sequence Listing as SEQ ID NOS: 1-526.

Table 1 shows the various tissue sources of SEQ ID NO: 1-526.

The nearest neighbor results for polypeptides encoded by SEQ ID NO: 1-526 (i.e.
SEQ ID NO: 527 - 1052) were obtained by a BLASTP (version 2.0al 19MP-WashU)
search against Genpept, Geneseq and SwissProt databases using BLAST algorithm. The
nearest neighbor result showed the closest homologue with functional annotation for SEQ
ID NO: 527 - 1052. The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 527 - 1052 are shown in Table 2 below.

Using eMatrix
software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6
pp. 219-235 (1999) herein incorporated by reference), polypeptides encoded by SEQ ID
NO: 1-526 (i.e. SEQ ID NO: 527 - 1052) were examined to determine whether they had
identifiable signature regions. Table 3 shows the signature region found in the indicated
polypeptide sequences, the description of the signature, the eMatrix p-value(s) and the
position(s) of the signature within the polypeptide sequence.

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) polypeptides encoded by
SEQ ID NO: 1-526 (i.e. SEQ ID NO: 527 - 1052) were examined for domains with
homology to certain peptide domains. Table 4 shows the name of the domain found, the
description, the product of all the e-value of similar domains found, the pFam score for
the identified domain within the sequence, number of similar domains found, and the
position of the domain in the SEQ ID NO: being interrogated.

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San
Diego, CA) was used to predict the three-dimensional structure models for the
polypeptides encoded by SEQ ID NO: 1-526 (i.e. SEQ ID NO: 527 - 1052). Models
were generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based
searching developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2)
High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA),
which is an automated sequence and structure searching procedure
(http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described
by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried
out, in part, by comparing the polypeptides of the invention with the known NMR
(nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates.
Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to template
structure; "Chain ID", identifier of the subcomponent of the PDB template structure;
"Compound Information", information of the PDB template structure and/or its
subcomponents; "PDB Function Annotation", which gives function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid position
of the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold score,
and the Potential(s) of Mean Force (PMF). The verify score produced by GeneAtlas™
software (MSI) is based on the Profile-3D threading program developed in
Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and
Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc.
Natl. Acad. Sci. USA, 95:12502-13597. The verify score produced by GeneAtlas
normalizes the verify score for proteins with different lengths so that a unified cutoff can
be used to select good models as follows:

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
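
The normalization can be written as a one-line function. This is a minimal sketch; how the "high score" is obtained for a given protein length is not defined in this passage, so it is treated here as a given input, and the function name is hypothetical.

```python
def normalized_verify_score(raw_score, high_score):
    """(raw score - 1/2 high score) / (1/2 high score), as in the formula above."""
    if high_score <= 0:
        raise ValueError("high_score must be positive")
    half_high = 0.5 * high_score
    return (raw_score - half_high) / half_high


# A raw score equal to the high score normalizes to 1; half the high score to 0.
print(normalized_verify_score(100.0, 100.0))
print(normalized_verify_score(50.0, 100.0))
print(normalized_verify_score(75.0, 100.0))
```

Raw scores below half the high score therefore give negative normalized values, which fall outside the 0 to 1 range.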

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence identity in the
alignment used to build the model, pairwise and surface mean force potential (MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good model may
also be determined by one of skill in the art based on all the information in Table 5 taken
in totality.

The nucleotide sequence within the sequences that codes for signal peptide
sequences and their cleavage sites can be determined using the Neural Network SignalP
V1.1 program (from Center for Biological Sequence Analysis, The Technical University
of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and
their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren
Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering,
Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and
a mean S score, as described in Nielsen et al., were obtained for the
polypeptide sequences. Table 6 shows the position of the last amino acid of the signal
peptide in each of the polypeptides and the maximum score and mean score associated
with that signal peptide.

Table 7 correlates each of SEQ ID NO: 1-526 to a specific chromosomal location.

Table 8 is a correlation table of the novel polynucleotide sequences SEQ ID NO:
1-526, novel polypeptide sequences SEQ ID NO: 527 - 1052, and their corresponding
priority nucleotide sequences in the priority application USSN 09/810,173, herein
incorporated by reference in its entirety.

Table 1



Tissue Origin 


RNA/Tissue
Source 


Library 
Name 


SEQ ID NO: 


adipocytes 


Stratagene 


ADP001


39 49 68 84 103-104 117 124 186 188-189 221 247 272 307 312 336-
337 353 356 369 434 461 495 509


adrenal gland 


Clontech 


ADR002 


11 14 25 30 39 83 90 92 100 108 111 131 133 137 144 148 155 164 
170-173 184 196 206 244-245 254 260 266 273 301 317 330 349 359 
383 392 397-398 401 411-414 423 442 466 468 486 510-511 518 


adult brain 


BioChain


ABR012 


47262 


adult brain 


BioChain 


ABR013 


60 205 


adult brain 


Clontech 


ABR001


17 39 55 61 95 124 137 153 186 233 247 252 287-288 307 322 353 
377 380 388 412-414 448 482 505-506 511 


adult brain 


Clontech 


ABR006 


9 17 26 32 38 41 61 77 81 83 87 95 106 117 134 137 143 153-154 158 
163 175-176 179 181 193 205 217 224-227 235 254 257 262 277 308 
340 342 359 369 376 389-391 419 433 442 446-447 458-461 466 474 
482 484 497-498 509 512 515 


adult brain 


Clontech 


ABR008


2 4 7 12 17-18 24-25 28-29 32 35-38 44 46-48 50 57 62-63 65-68 70 
74-75 77 84 96 101 103-104 107-109 112-113 117 120 125 127 144 
151-153 158 163 166-167 170-175 178 181-182 185 187 191 193 196 
200-201 204 209-210 223 225 231 239-243 247-248 257 259 262 264- 
266 276-277 279-280 282 289-290 311-312 321-322 326 331 337-338 
342 346-347 349 353 356 358 360 366 369 375 380 389-391 405 408 
41 1-414 426-427 442 449 452-454 456 458 463 473-476 480 482 489 
493 495 498 503 505-506 510-512515521 


adult brain 


Clontech 


ABR011


394 


adult brain 


GIBCO 


AB3001 


9 13 21 32 34 49 58 61 77 92 98 124 138-141 154 205 248 254 282 
289 298 309 323 326 342 371 412-414 461 475 


adult brain 


GIBCO 


ABD003 


9 15-16 18 24 26 32 34 39 54 60-61 66 68 79 96 98 109 1 12 1 17 120 
124 131 140 143-144 153-154 162 170-173 181 195-196 201 205 223 
231 233-234 252 257 273 287-288 298 300 313 317 323 326 345-346 
369 371 376 379 383-384 386 397 405 41 1-414 418 442 495 497 501 
511 521 


adult brain 


Invitrogen 


ABR014 


65 125 184 247 307 338 467 490 509 513 


adult brain 


Invitrogen 


ABR015 


12 34 60 73 127 140 287 417 445 


adult brain 


Invitrogen 


ABR016 


3 24 34 136 177 248 307 452 474 


adult brain 


Invitrogen 


ABT004 


29 3947 65-66 83 87 97 107 143 151-152 156 163 166-167 193 196 
217 220-221 254 266 281 307 317 334 378 382 389 397 412-414 430 
473 509 


adult heart 


GIBCO 


AHR001


5-6 1 1 15-16 18-20 23 34 39 41 48 50 62-63 65 70 77 84 86 92 95-100 
103-104 107 109 111 114 118 124-125 127 142-144 154 162 165-167 
170-175 178 181-182 186 188 191 193-197 200 206-207 217 221 224 
228 247 257 266 273-275 281 287-288 317 337 340 346 353 355 362- 
363 369 374 376-377 382 384-385 390-391 397-398 400 411-414 423 
434 440 474 482 489 498 500-502 509-5 10 513 


adult kidney 


GIBCO 


AKD001


5-6 1 1-12 14-16 19 22 24 27 32 34 39 41 46-47 49 51 53 55 58 62-63 
68 77 80 83-84 91-92 98 100-107 1 10 1 16 119 125-127 137 144-147 
154 160 162 165 178 181-182 188-189 193 207 210 215-217 231-233 
240 247-249 254 257 264 273-274 287-288 298 306 32 1 323 326 330 
334 340 342 346 353 367 371 376 382 384-385 394 397 400 41 MM 
429-430 444 446-447 456 461 467 474-475 482 489 495-498 509-51 1 
514 516 524-525 


adult kidney 


Invitrogen 


AICT002 


1 18 27 34 58 66 77 101 107 124 129 131 136-137 146 155 181-182 
196 206 217 264 266 274-275 288 291 320 326 334 375-376 394 400- 
401 408 41 1-414 418 423 435-437 444 452 458 473 481-482 501 504 
509 519 


adult liver 


Clontech 


ALV003 


32 74-75 94 137 247420516 


adult liver 


Invitrogen


ALV002 


6 12 18 23 25 34 49 65 74-75 80 87 94 98 1 18 122 133 137 15 M52 
163 170-173 lo6 197 223 236 246-247 254 Z5o 266 2o5 32o 344 
370 383 387 399 412-414 452 456 460 462 466 473 475 497 519 


adult lung 


GIBCO 


ALG001


15-16 18 27 34 47 65 72 74-75 83 92 127 137 155 185 210-212 215- 
216 248 288 318 326 331 337 382 400 434 461 474 492 495 516 


adult lung 


Invitrogen 


LGT002 


5 1 M2 14 16 18-19 24 26 30 32 34-36 39 46 49 55 57 66-67 73-75 80 
84 9297-99 103-105 108 112 120 124^125 134 150 166-167 169-173 
179-180 182 186 188 193 196 202-203 210 212 215-217 221 225 231 
246-247 254 256 266 273 281-282 288 307 309 317-318 326 331 335 
342 346 34o 353 356 365-366 il^-^lo 3ol 3o5 389 397-3^0 41 1-414 
418 426 434 452 456 475 489-490 495 501 503-504 508-509 521 


adult spleen


Clontech


SPLcOl 


17 22 25 54 71 108 117 121 130 133 153 184 207 226-227 254 257 
281-282 331 346 364 384 398 406 416 461 512 


adult spleen 


GIBCO 


ASP001


15-16 22 24 26 34 41 77 96 103-104 107 111-112 121 124 142 144 
155 158 163 182 206-207 215-216 255 281 287 326 337 342 354 370 

398 411-414 434 456 473-474 495 511 


bladder 


Invitrogen 


BLD001 


35-36 77 103-104 124 144 218 281 287 337 367 369 376 430 434 460 

509 


bone marrow 


Clontech


BMD007 


32 


bone marrow 


Clontech 


BMD001


2 5 912 15 17-18 20 24-25 27 30 34 38 54-58 68-72 77 88-91 95 103- 
104 110 112 122 124 155 162 165 176 178 181-182 186 188 193 199 
204 215-217 221 230 233 246 254 274 288 292 305 307 309 326 331 
340 342 349 364 376 379 389-391 401 411-414 416441 446-448 489 
497-498 500 503 513-514 516 518 524 


bone marrow 


Clontech 


BMD004 


346460 


bone marrow 


GF 


BMD002 


4 17-18 23-25 27-28 30 32 35-36 38 47 51 53 57 71 74-75 77 87 90- 
92 95 103-104 107-108 113 117 122-125 133 137 148-149 151-152 
154-155 170-173 178 181-182 184 186 189 191 196 198 209 215-216 
221 231 233 250254 266 272 276 281 283 287 301 317 326 330-331 
337 342 346 349 356 364-366 371 379 392 394 396 402 406 408 41 1- 
414 421-422 433 435-438 442 461 467-468 475 489 495 498 501 503 
505-506 509-5 10512514517-518 


cervix 


BioChain 


CVX001


5-6 18 20 24 30 32 42 44 55-56 66 68 72 84 92 96 99-100 110-111 
120 131 134 137 144 146 151-152 162 165 170-173 175-176 181-182 
184 186 190 1 93 195 197 207 210 214-216 238 246-247 254 266 272- 
273 275 282 287 291-293 298 317 321 323 326 333 340 342 353 355 
365 367 369-370 378 382 41 1-414 4 18 423 434 438 452 456 458 460- 
465 473-474 476 479 492 498 500 504 507 510 524 


colon 


Invitrogen 


CLN001


11 13 34 81 100 105 126 184 186 196 254 317 328 330349 400 412- 
414 426 460 466 5 10 525 


diaphragm 


BioChain 


DIA002 


226-227 


endothelial 
cells


Stratagene


EDT001


2 13-14 16-19 22 24 26-27 30-31 34-36 47 49 53 58 62-63 65-68 73 
80 83 85-86 92 96 98 100 102 106-108 114 117-118 125-126 132 137 

\A1 f^fi.1dO \fn 17rt-17^ I7S \9.\A%7. 1jtX.190 
I'M, X*¥k l*tO-l*»3r lO*t iQO-lD/ If J I/O I0I-10£ JOO-17V 

196-197 206 213-214 217 221 231 246-247 254 257 266 273 288 306- 
307 309 313 318 323 326 334 337 340 342 355 366 369 371 375-376 
379-382 389 400 406 409 41 1-414 423 426 429 431 440 445 452 456 
461 467-468 474 482 490 503-504 508-510 514 516 


fetal brain 


Clontech 


FBR001


39 87 247 353 375 452 460 513 


fetal brain 


Clontech 


FBR004 


181205 393 


fetal brain 


Clontech 


FBR006 


1 7-8 12 17-19 24 27 29-30 32 34-36 46-49 53 58 62-63 70 77-78 85 
95-96 103-104 107-108 120 125 127 134 151-153 164 166-167 175- 
176 182 184-185 189 196 201 204 217 223 225 229 231 242 245 247 
253-254 264 266 269-270 275 280-281 287 294-295 304-305 321 326 
329 331 346 353 355-356 359 369 375 379 381 389-391 394 411 418 
423-427 430 440 442 445 449-450 452 454 456 461 463 469 474 478 
48 1 -482 493 495 502 504-506 511-512518 


fetal brain 


GIBCO


HFB001


5 1 1-12 15-16 18 20 24 27-28 30 34-36 46 53 58-67 69 84 92 97 100 
112 114 120 124 128 134-136 138 143 151-154 159-167 182 184 186 
188-190 193 196 205 207 217 221 223 233 248 264 266 273-274 282 
285 287 305 307-308 326 338 340 342 349 366-367 371 375 379 389- 
391 397 400 41 1-414 431 442 452 467 476 480 482 489 492 497-498 
503 508-509 511 


fetal brain 


Invitrogen 


FBT002 


13 15 18-19 25 37 42 46 60 65-66 74-75 118 132 137 140 150-153 
175 185 196 203 222-223 235 247-248 266 298 307 331 353 366 382 

397 430 452 481-482 495 508-509 511 


fetal heart 


Invitrogen 


FHR001


6 15 18-19 24 26 29 37 46 57-58 74-75 77-7o 81 9o lUi-lW U4 iz/ 
134-135 151-153 160 164 178 181 184 186 191 201 204-205 207 224 
242 245 247 253-254 257 273 276 281-282 287-288 309 312 317 326 

1 o tc^ tc£. "yci nA "ync "iin ic\f\ 70i 70/1 A{\t\ A(\fk A(\9. Ay 1 A\A 
338 353 356 3o3 370 Slb-Sl / 3o2 3yU-iyi 3^4 huU *IU0 '^Uo 4 1 

427-430 439^ 453 474 478 489-490 495 498 501 510-512 515 525 


fetal kidney Clontech 


FKD001


17 39 92 97 99 133 193 203-205 318 32o 371 3V/ 401 *h»6 


fetal kidney 


Clontech 


FKD002 


27-28 46 48-49 53 69-70 81 94 105 117 131 137 181-182 196 200 205 
221 226-227 247 254 258 329 337-338 373 381 397 415 431 451-452 
463 488 503 511-512 515 


fetal liver 


Clontech 


FLV002 


19 170-173 223 298 401 


fetal liver 


Clontech 


FLV004 


4 19 25-26 29 32 37^38 46 53 80-81 92 96 100-101 103-104 108 114 
124 127 136 153 178 181 184-185 199 208 215-216 257 272 287 298 
306 309 326 376 396 401 442 446 453 461 467 474 497 510 512 


fetal liver 


Invitrogen 


FLV001


12 16 25 3244 60 77 80 117 137 144 188 230 246-247 266272 281 
298 342 353 382 401 412-414 449 460 482 495 519 


fetal liver- 
spleen 


Soares 


FLS001


2-21 23-43 45-55 58 65 67 69-70 72-81 83 85-86 92-94 96-97 100 
103-108 110 115 120 124-125 131 133 137 144 146 149-155 158-159 
165 175 178 180-182 185-186 189 191-193 196 210 215-216 228-230 
238 246-248 254 264 266 272-273 282-283 285 288 292 298 305 307 
309 317-318 321 323 326 330 334-337 339-341 345-346 351 353 355 
359 365-366 370-371 375-376 382 3 84-386 389 395-402 41 1-414 426 
434 438 441-442 444 449 458 467 474-475 481-482 489-490 492 495 
497 501 503-512 5 14 516 519 522 525 


fetal liver- 
spleen 


Soares 


FLS002 


2-3 5-69 11-12 15-16 18-20 23-28 31 35-3638-41 47-49 51-55 57-60 
65 68 73-75 77 80 83 90 93 97-98 100-101 107-108 1 14 120 124 127- 
128 131 133 137 143-144 148 150-152 155 157 163 166-167 174 177 
179 181-182 184 187-191 196 200 215-216 226-227 229-231 241 

246-248 254 25o 266 2/2 285 2o/-25o 30/J12j1o J20 JJj-3'#2 mo 
348 350-356 366 370-371 376 379 382 386 389 398 401 405 409 41 1- 

414 434 441-445 44o-44y 452 45o 400 4oo 4/ 1 4/4-4 /5 4ol-*H54 4oy- 

490497 501 516518 521 


fetal liver- 
spleen 


Soares 


FLS003


6 16 21 48 65 72 84 98 1 10 1 14 124 208 215-216 229 254 286 288 
307 317 336-337 356 366 370 397 401 405-408 434 444-447 455 493 
497-498 501 504-506 


fetal lung 


Clontech 


FLG001


65 137 237 247 281 312 334 434 5 10 


fetal lung 


Invitrogen 


FLG003 


49 66 77 105 121 182 246-248 281 294 302 337 353 366 401 412-414 
460 


fetal muscle 


Invitrogen 


FMS001


9 23 53 84 95 1 18 281 322-323 331 336 346 355 366 401 446-447 461 
473 498 509 519 


fetal muscle 


Invitrogen 


FMS002 


23 25 28-29 48 58 92 103-104 124 127 131-132 201 217 247 255 257 
276 281 316 323 326 337 353 373 41 1 429 431 446-447 453-454 474 
490 498 502 512 519 


fetal skin 


Invitrogen 


FSK001


5 9 15-16 18 24 26 28 30 32 35-36 40 48 62-63 66 77 87 95-96 98 
103-104 107-109 120 124 131-132 141 170-173 175 177 182 186 198 
204 226-227 235 251 266 273 281 285 287 295 302 309 313 320-328 
332-333 336 346 349 353 355 366-367 369 375 385-386 389-391 397 
401 41 1 434 442 452 456 460 467 501 509-510 512 


fetal skin 


Invitrogen


FSK002 


4 24-2631 46 48 53 68 71 74-75 77 81 87 109-110 117 151-152 170- 
173 178 181-182 185-186 193 196 204 209 215-216 225-227 245 247 
253 275-276 287 307 326 328 331 333 337 353 369 373 379 390-391 
412-414 418 432 439-440 452-454 463 467 475 489 495-496 502-503 
505-506510 512 515 


fibroblast 
epilepsy 


Julio_m


EPM001


357 


fibroblast 
epilepsy 


Julio_m 


EPM004 


357 


fibroblasts 


Julio_m


BAC001


484 


infant brain 


NULL 


IBM002 


13 42 48 61 77 170-173 184 190 308 444 456467 


infant brain 


NULL 


IBS001


26 60 84 100 137 143 170-173 175 184 281 315 366 376 397 489 507 


infant brain 


Soares


IB2002 


9 13 16 18 20 22 24 26 30-31 34 37-38 45 47-48 54 60-63 66 69 77 
80-81 83-84 95-96 99 103-104 111 117 119 121 124-125 127 139-140
154-155 160 162-163 168 175-176 179 182 184-185 196 200-201 205 
218 220 226-227 247 252 259-260 266 273 281 287-288 307-308 317 
326 331 337 340 342 346 349 353 365 369-371 375 383-384 390-391 
397-398 426 434 442 444 446-447 456 458 460-461 467 473-474 481 
489 492 495 497-498 501 505-507 509-511 525


infant brain 


Soares


IB2003 


2 13-14 17 24-25 30 38 49 61 66 77 87 95 107 130 137 140 143-144 
153-154 163 175-176 184-185 196 200-201 205 207 223 245 247-248 
254 259 266 273 281 287-288 307-308 317-318 326 331 338 341 346 
353 371 383-384 397 411 442 456 458 460 489 492 495 497 501 507
510 512 515


leukocytes 


Clontech 


LUC003 


5 77 112 137 165 181-182 272 307 376 416 453 508-509 512


leukocytes 


GIBCO 


LUC001


5 13-15 18-20 24-25 27 32 34 37 39 43 46-47 53 55-56 58 64 67-68 
70 74-77 84 87 96 101 103-104 108-115 123-126 131 135 137 143- 
144 150 153 155 164-167 169 178-179 181-182 184 188-190 196 200 
207 210 212 215-216 221 223 235 248 254 257 267 274 281-283 287 
302 306-307 309 312 316-317 321 326 331 337 340 342 349 364-366 
371 375-376 379 382 389-391 394 396-397 405-406 411-414 416 426
429 434 442 444 452 457-458 464-465 467 470 489 495 501 503-506 
509 511-513 524 


lung 


Strategene


LFB001


6 11 13 15 41 46 56 84 92 112 143 154 178 181 190 197 202 217 282 
307 312 336 365 389 456 474 482 484 504 


lymph node 


Clontech 


ALN001


18 71 122 155 176-177 202 326 338 411


lymphocyte 


ATCC 


LPC001


etc <^A 'ic •>fk in AA cj cc T/l "Tl OT OO OA 1/17 119 117 19(1 125 151 
5 15 24-25 29 39 44 53-!>j /U-/ 1 o / i'O lU/ i 11/ iav iaj 1^1 

137 144 155 165 181-182 210 217 254 266-267 272 283 288 317 321 

342-343 346 365 370 375 379 384 394 396 411 442 448 453 461 467- 

468 474 478 493 496 501 503-504 513 


macrophage 


Invitrogen 


HMP001


24 69 113 129 137 144 287 326 389 396 398 406 467 510


mammary 
gland 


Invitrogen 


MMG001


15-18 24-26 30 32 35-37 39 44 49 62-63 65-66 72 77-78 83 87 100- 
101 103-105 107 109 112 114 117 131-132 137 144 146 151-153 157- 
158 170-173 182 187-188 190 196-197 223 234-235 240 243 246-248 
254 266 272 281 283 287 298 300 302 317 319 326 330-333 337 341- 
342 353 355-356 371 375 380-382 385 397 400 411-414 423 434 442
445 452 456-457 460-461 465 473 475 495 501 507-510 516 519 521
525 


melanoma 


Clontech 


MEL004 


18 39 50 73 92 118 124 127 208 212 247 285 303-304 317 322 326 
342 353 452 473-474 492


♦Mixture of 16 
tissues - 
mRNA 


Various 
Vendors 


CGd010


60 77 94 322 338 473 478-479 496 519


♦Mixture of 16 
tissues - 
mRNA 


Various 
Vendors 


CGd011


39 77 243 247 352 401 412-414 471 480 500 


♦Mixture of 16 

tissues- 

mRNA 


Various 
Vendors 


CGd012 


13 18 20 25-26 30 39 46 50 56 59 65 72 77 80-81 95 99 108 110 124
144 148 189 194 215-216 225 232 241 243 247 284 287 299 326 331 
337 351-352 368 380 390-391 401 412-414 418 460 467 471 493 499- 
503 516 


♦Mixture of 16 

tissues - 
mRNA 


Various 

Vendors 


CGd013


26 58 81 105 127 284 331 


♦Mixture of 16 
tissues - 
mRNA 


Various 
Vendors 


CGd015


4 18 34 39 60 67 71 106 147 180 207 254 331 367 370-371 401 456 
497 501 503 507-509 512 


♦Mixture of 16 
tissues - 
mRNA 


Various 
Vendors 


CGd016 


2 29-30 77 112 131 143 175 184 248 259 307 335 359 397 401 409 
505-506 524


neuron 


Strategene 


NTD001


3 8 11-13 45 69 77 79 81 131 137 139-140 166-167 179 207 295 307 
317 361 366 423 444 497 514 520


neuron 


Strategene 


NTR001


77 81 95 103-104 111 163 181-182 342 353 375 379 446-447 456 460 
467 495 


neuronal cells 


Strategene 


NTU001


17 39 79 95 111 117 140 151-152 182 266 305-306 358 369 373 375
398 430 448 458 467 475 509 514 


ovary 


Invitrogen 


AOV001


2 5-6 8-9 12 15-16 18 20 24-25 27-28 30 34 39 44 48 54 58 61 65 67- 
69 74-75 77 84 86-87 95 97-98 101 103-105 107 110-112 114 118 120 
125-127 131 134 137-138 142 144 148-150 153 155-156 162 164-165 
169-173 175-187 189-190 193 197 199-200 205 207 210 215-219 221 
225-228 231 246-247 254 264 266 272 274-275 281-282 288 298 307 
309 313 317-318 321 323 326 331 336-338 340-342 346 349 353 355- 
356 365-366 369-370 373-376 378 380-382 389 394 411-414 418 423
434 442 444 452 455-456 458 467-468 473-474 477 481 489 492 496- 
497 500 504 507 509-510 515-516 521 524-525 


pituitary gland 


Clontech 


PIT004 


12 14 137 151-152 164 189 266 380 461 467 513 516 521 


placenta 


Clontech 


PLA003 


24 71 84 92 96 103-104 178 182 184 246 262 289 304 317 326 331 
333 337 385 431 433 440 452 493 503 511-512 


placenta 


Invitrogen 


APL001


151-152 182 215-216 247 340 


placenta 


Invitrogen 


APL002 


24 34 49 80 83 107 112 125 153 190 247 353 397 400 510


prostate 


Clontech 


PRT001


1 C O OA t\C %t\C lit \*\A lie 141 101 1 0A 1 QiC 1>l£ ')jf O OQ 1 OOQ 

15 28 53 80 96 105 112 124-123 141 lol lo4 i9o^4o-Z4o £oi /Vo 
353 368 382 474 499 524 


rectum 


Invitrogen 


REC001


18 78 80 83 105 196 226-227 248 266 275 281 296-297 321 366 369 

390-391 397-398 406 411-414 460 489 509-510


saliva gland 


Clontech 


SALS03 


482 


salivary gland 


Clontech 


SAL001


25 39 41 124 202 268 299 338 340 353 355 365 381 411 418 430 489- 
490 498 501 516 


skeletal muscle 


Clontech 


SKM001


11 23 182 186 217 226-227 247 353 378 386 411 498 513 525


skin fibroblast 


ATCC 


SFB001


16 


small intestine 


Clontech 


SIN001


12 18 20 24 26 30 35-36 39 48 53 62-63 74-75 86 92 99-100 105 107- 
108 112-114 120 125 137 142 153-154 165 loV-173 1/5 loZ io5-loO
204 206-207 210 215-216 221 229 231 246-247 249 251 254-255 266 
272 281-282 285 287 298 307 316 326 330 337 340 342 349 351 353 
356 367 369 371 376 382 385 389 394 396 400 403-404 41U 41Z-415 
417 456-457 460-461 466 482 484 489 501 512-513 516 525


spinal cord 


Clontech 


SPC001


22 34 41 51 66 88 121 124 133 137 155 158 178 181-182 196 214-216 
229-230 247-248 250 254 261 269-271 281 318 353 367 369 371 416 

444 458 461 496 501 508 510-512 520 


stomach 


Clontech 


STO001


9 16 55 86 165 251 254 274 282 323 355 385 410 434 457 482 501 
507 


testis 


GIBCO 


ATS001


13-15 17-18 24 41 46 66 77 80 107-108 110 120 131 154 162 178 185 
233 246 272 281 287-288 306 317 342 365 394 400 411 418 427 434
444 489 495 504 509


thalamus 


Clontech 


THA002 


32 39 60 68 126 137 144 154 185 190 247 252 254 273 308 321 341 

349 353 371 397 400 430 466 475 521


thymus 


Clontech


THM001


14 17 25 28 30 34 39 49 53-54 61 76 87 100 124 128 137 151-152 158
182 196 202 215-216 246-247 254 261 274 281 298 316 322-323 340 
346 349 353 364 366 369-371 376 384 389 408 411-414 438 444 455
467 489 501 504 509 516 524 


thymus 


Clontech 


THMc02 


4 18 25 27 34-36 38 46-47 53-54 64 71 74-75 77 81 87-88 92 96 108 
137 155 170-173 180 184 196 200 202 211 225-227 229 231 233 239
254 262 272 281 283-284 287 310 316 333 337 356 366 369 373 375- 
376 390-391 397 406 411 431 442 459-460 467 473-474 482 501 503
509 512 516 518 520 524 


thyroid gland 


Clontech 


THR001


5 9 11-12 14 16-19 24-25 27 29-30 34 42 46-48 55 57-58 61 67 69 77
88 92 96 100 114 120 124 128-129 131 133-134 137 151-155 165 
170-173 175 177 182 196 206 215-216 231 247 249 251 253-255 263- 
264 266 272-275 282 285 288 307 309 330-331 337 340 345 349 353 
365-367 369 371-372 376 381 396-397 409-414 433-434 440 444 452 
456 467 475 497 500 509 511 513 515 524-525


trachea 


Clontech 


iKCUUI 


lo 2A /U liO 1 l*h 215-zlO ZJo ZDl zoO j05 jo J hjD *f07 ^ lU jxU j^h 


umbilical cord


BioChain 


FUC001


9 15 17-18 22 26 29-30 34 39 41 47 58 70 72 96 99 103-104 112 114
120 124 128-129 151-152 157 161 170-173 182 186 207 215-216 228 
238 246-247 254 273 285 287 300 302 307 314 317 321 326 329-333 
336-338 342 353 367 369 378-379 382 389-391 401 406 444 448 452 
461 465 468 474 489 492 508-509 512 521 524 


uterus 


Clontech 


UTR001


47 84 111 114 197 211 246-247 273 281 307 353 384 412-414 442
489 504 


young liver 


GIBCO 


ALV001


15-16 23 38 67 92 96 101 114 120 130 137 154 165 176 182 184 186
209 254 337 340 366-367 382 405 411-414 429 452 474 497



♦The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2)
Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA
(Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) Normal fetal
skin mRNA (Invitrogen), 8) Human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10)
Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) Human lymph node
mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) Human thyroid mRNA (Clontech), 15) Human
esophagus mRNA (BioChain), 16) Human conceptional umbilical cord mRNA (BioChain).
527


gi9798452 


Homo sapiens 


mRNA for putative capacitative 
calcium channel (trp7 gene). 


4470 


100 


527 


gi5326854 


Mus musculus 


receptor-activated calcium channel 


4392 


98 


527 


gi2295903 


Homo sapiens 


Human putative calcium influx channel 
(htrp3) mRNA, complete cds. 


3529 


81 


528 


AAG89238 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 358. 


545 


100 


528 


AAG93320 


Homo sapiens 


NISC- Human protein HP10515.


545 


100 


528 


gi13620915


Homo sapiens 


bMRP63 mRNA for mitochondrial 
ribosomal protein bMRP63, complete
cds. 


545 


100 


529 


AAW78211 


Homo sapiens 


HUMA- Human secreted protein

encoded by gene 86 clone HTwCT03. 


333 


88 


529 


gi7294596 


alt 2 


CG4300 gene product [Drosophila 


65 


31 


529 


gi7294595 


alt 1


CG4300 gene product [Drosophila 


65 


31 


530 


AAB95369 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17686. 


2361 


100 


530 


gi10435142


Homo sapiens 


cDNA FLJ13215 fis, clone
NT2RP4001447. 


2361 


100 


530 


gi16041164


Macaca 
fascicularis 


hypothetical protein


1576 


89 


531 


gi13625172


Homo sapiens 


5-HT receptor mRNA, complete cds.


1615 


93 


531 


gi10503978


Homo sapiens 


Clone SP329 unknown mRNA. 


1615 


100 


531 


gi7300419 


Drosophila 
melanogaster 


CG17796 gene product 


96 


27 


532 


gi10438219


Homo sapiens 


cDNA: FLJ21986 fis, clone HEP06248.


1425 


99 


532 


AAO13496


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 27388. 


1125


99 


532 


ABB11720


Homo sapiens 


HYSE- Human novel protein, SEQ ID 
NO:2090. 


725 


97 


533 


gi4929685 


Homo sapiens 


CGI-108 protein mRNA, complete cds.


269 


98 


533 


gi12838900


Mus musculus 


putative 


269 


98 


533 


AAY65253 


Homo sapiens 


GEST Human 5' EST related 
polypeptide SEQ ID NO:1414. 


265 


96 


534 


gi220500 


Mus musculus 


NDPP-1 protein


65 


29 


534 


gi6679028 


Mus 

musculus] > 

[Mus 

musculus 


NPC derived proline rich protein 1 


65 


29 


535 


AAG02210 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6291.


397 


98 


536 


gi7573295 


Homo sapiens 


Human DNA sequence from clone 
RP 1-238023 on chromosome 6. 
Contains part of the gene for a novel 
protein similar to PIGR (polymeric 
immunoglobulin receptor), part of the 
gene for a novel protein similar to rat 
SAC (soluble adenylyl cyclase), ESTs, 
STSs and GSS, complete sequence. 


389 


75 


536 


gi4140400


Rattus 

norvegicus 


soluble adenylyl cyclase 


176 


47 


536 


AAB81929 


Homo sapiens 


STRD Human soluble adenylyl 
cyclase. 


172 


45 
537


AAY10830 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 


246 


100 


537 


gi13815145


Sulfolobus 
solfataricus 


Hypothetical protein 


68 


37 


537 


gi15898682


Sulfolobus 
solfataricus] > 
[Sulfolobus 
solfataricus 


Hypothetical protein 


68 


37 


538 


gi12841269


Mus musculus 


putative 


503 


84 


538 


AAY13186 


Homo sapiens 


GEST Human secreted protein encoded 
by 5' EST SEQ ID NO: 200. 


406 


97 


538 


AAW67825 


Homo sapiens 


HUMA- Human secreted protein
encoded by gene 19 clone HELBW38. 


369 


100 


539 


AAS15817_aa
1 


Homo sapiens 


S AAT/ Human cDNA encoding 
prostate specific protein SSH9. 


730 


94 


539 


AAU10191 


Homo sapiens 


S AAT/ Human prostate specific protein 
SSH9. 


730 


94 


539 


AAB58298 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 636. 


730 


94 


540 


AAB43589 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1034.


913 


100 


540 


gi5817181 


Homo sapiens 


mRNA; cDNA DKFZp566E104 (from 
clone DKFZp566E104); partial cds.


745 


99 


540 


gi7512814 


Homo sapiens 


hypothetical protein DKFZp566E104.1
- human (fragment) > 


745 


99 


541 


AAB58235 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 573. 


1480 


100 


541 


gi5410296


Homo sapiens 


homeobox prox 1 mRNA, complete
cds. 


1267 


100 


541 


gi4929667 


Homo sapiens 


CGI-99 protein mRNA, complete cds.


1267 


100 


542 


gi7108913 


Homo sapiens 


glucocorticoid receptor AF-1
coactivator-1 mRNA, partial cds. 


1818 


100 


542 


AAM66710 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27016. 


513 


66 


542 


AAM54312


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 26417. 


513 


66 


543 


AAA61620_aa 
1 


Homo sapiens 


MITO- cDNA encoding human 
ubiquitin-conjugating enzyme rapUBC. 


275 


100


543 


AAZ10849_aa 
1 


Homo sapiens 


DAND TIA-1 binding protein 1
(TIABP1) gene.


275 


100


543 


AAVS1398_aa
1


Homo sapiens 


DAND Human TIABP1 genomic


275 


100 


544 


AAB43887 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1332. 


1183 


100 


544 


gi533111 


Canis 
familiaris 


signal peptidase complex 25 kDa 
subunit 


1130 


95 


544 


gi12856773


Mus musculus 


putative 


1129 


95 


545 


gi6841242 


Homo sapiens 


HSPC296 


567 


99 


545 


gi12842164


Mus musculus 


putative 


564 


97 


545 


gi7293870 


Drosophila 
melanogaster 


CG6884 gene product 


236 


45 
546


gi3043652 


Homo sapiens 


mRNA for KIAA0564 protein, partial 
cds. 


5065 


100 


546 


gi3875726 


Caenorhabditis
elegans 


similar to nir like gene involved in
denitrification-cDNA EST yk12a1.3
comes from this gene-cDNA EST
yk7c7.3 comes from this gene-cDNA
EST yk7c7.5 comes from this
gene-cDNA EST yk34c7.3 comes from
this gene-cDNA EST yk12a1.5 comes
from this gene-cDNA EST yk24f12.5
comes from this gene-cDNA EST
yk34c73 comes from this gene-cDNA
EST yk154e5.3 comes from this
gene-cDNA EST yk212d10.3 comes
from this gene-cDNA EST yk212d10.5
comes from this gene-cDNA EST
yk225b7.3 comes from this
gene-cDNA EST yk225b7.5 comes
from this gene-cDNA EST yk243b7.5
comes from this gene-cDNA EST
yk349d4.5 comes from this
gene-cDNA EST yk367e8.3 comes
from this gene-cDNA EST yk367e8.5
comes from this gene-cDNA EST
yk420G.3 comes from this
gene-cDNA EST yk420O.5 comes
from this gene-cDNA EST yk529f9.5
comes from this gene-cDNA EST
yk565d10.5 comes from this gene


1447 


34 


546 


gi10728542


Drosophila 
melanogaster 


cl2.2 gene product 


1005 


56 


547 


gi12052936


Homo sapiens 


mRNA; cDNA DKFZp566E2324 
(from clone DKFZp566E2324); 
complete cds. 


955 


100 


547 


gi10439692


Homo sapiens 


cDNA: FLJ23112 fis, clone
LNG07874. 


580 


100 


547 


gi6692513 


Hepatitis B 
virus 


large S protein 


81 


32 


548 


AAY07902 


Homo sapiens 


HUMA- Human secreted protein 
fragment encoded from gene 51.


322 


88 


548 


gi4008342 


Caenorhabditis 
elegans 


predicted using Genefinder-contains 
similarity to Pfam domain: PF01496 
(V-type ATPase 116kDa subunit
family), Score=925.6, E-valueM.oe-
275, N=1-cDNA EST yk15f10.3
comes from this gene-cDNA EST
yk15f10.5 comes from this
gene-cDNA EST yk224h11.3 comes
from this gene-cDNA EST yk223d1.3
comes from this gene-cDNA EST
yk287c7.3 comes from this
gene-cDNA EST yk321h11.3 comes
from this gene-cDNA EST yk224h11.5


66 


39 
comes from this gene-cDNA EST
yk223d1.5 comes from this
gene-cDNA EST yk287c7.5 comes
from this gene-cDNA EST yk321h11.5
comes from this gene






548 


gi7496564 


Unknown 


hypothetical protein C26H9A.1 - 
Caenorhabditis elegans >


66 


39 


549 


gi17389834


Homo sapiens 


Similar to RIKEN cDNA 2310035L15 
gene, clone MGC:23953 
IMAGE:4292862, mRNA, complete
cds. 


1024 


100


549 


gi12844552


Mus musculus 


putative 


906 


89 


549 


AAM93823 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3881. 


727 


46 


550 


AAH26493_aa 
1 


Homo sapiens 


HOST- Human low density lipoprotein 
binding protein 1 (LBP-1) gene. 


697 


94 


550 


AAH26492_aa
1 


Homo sapiens 


HOST- Human low density lipoprotein 
binding protein 1 (LBP-1) cDNA. 


697 


94 


550 


AAB82802 


Homo sapiens 


HOST- Human low density lipoprotein
binding protein 1 (LBP-1). 


697 


94 


551 


gi12698332


Homo sapiens 


C/EBP-induced protein mRNA, 
complete cds.


2084 


100 


551 


gi14150747


Mus musculus 


GIG18 


641 


43 


551 


gi5739567 


Homo sapiens 


BAC clone RP11-505D17 from 7p22-
p21, complete sequence. 


635 


44 


552 


gi11761611


Homo sapiens 


kinesin-like protein RBKIN1 (RBKIN)
mRNA, complete cds, alternatively
spliced. 


6087 


99 


552 


gi11761613


Homo sapiens 


kinesin-like protein RBKIN2 (RBKIN) 
mRNA, complete cds, alternatively 
spliced. 


5852 


96 


552 


gi12054030


Homo sapiens 


mRNA for KINESIN-13A1 (KIN13A
gene). 


5771 


95 


553 


gi17391063


Homo sapiens 


Similar to RIKEN cDNA 1500032H18 
gene, clone MGC:21379 
IMAGE:4509694, mRNA, complete 
cds. 


1311 


100 


553 


gi12837824


Mus musculus 


putative 


1083 


83 


553 


gi7292416 


Drosophila 
melanogaster 


CG14985 gene product


383 


35 


554 


gi12857727


Mus musculus 


putative 


1260 


94 


554 


gi6851256 


Mus musculus 


protein tyrosine phosphatase-like 
protein PTPLB 


1242 


93 


554 


AAB59515 


Homo sapiens 


HUMA- Human secreted protein 
BLAST search protein SEQ ID NO: 
104. 


1092 


100 


555 


AAM93439 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3078. 


1266 


100 


555 


gi16741367


Homo sapiens 


clone MGC:17276 IMAGE:4180160, 
mRNA, complete cds. 


1266 


100 


555 


gi15079907


Homo sapiens 


Similar to secretory carrier membrane 
protein 4, clone MGC:19661 
IMAGE:3161979, mRNA, complete


1266 


100 
cds.






556 


gi16507984


Human 
endogenous 
retrovirus 
K115 


putative env 


430 


48 


556 


gi4185944 


Human 
endogenous 
retrovirus K 


env protein 


429 


47 


556 


gi3150438


Human 
endogenous 
retrovirus K 


pol-env 


429 


47 


557 


AAB98212 


Homo sapiens 


NANF- Human early endosome antigen 
1 isomer (hEEAl-iso) SEQ ID NO:7. 


1129 


100 


557 


gi9963835 


Homo sapiens 


AD024 mRNA, complete cds.


1129 


100 


557 


gi12834062


Mus musculus 


putative 


717 


78 


558 


gi12847029


Mus musculus 


putative 


1082 


76 


558 


AAY60569 


Homo sapiens 


META- Human normal bladder tissue
EST encoded protein 241. 


1073 


100 


558 


gi12854670


Mus musculus 


putative 


525 


80 


559 


gi15824269


Homo sapiens 


NEDD4-like ubiquitin ligase 3


64 


34 


559 


gi2662159 


Homo sapiens 


KIAA0439 


64 


34 


560 


AAB43895 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1340.


814 


100 


560 


gi5231141 


Homo sapiens 


sin3 associated polypeptide (SAP18)
mRNA, complete cds. 


804 


100 


560 


gi2108210 


Homo sapiens 


sin3 associated polypeptide p18
(SAP18) mRNA, complete cds.


804 


100 


561 


gi17061811


Homo sapiens 


C21orf57 isoform A protein (C21orf57) 
mRNA, partial cds, alternatively
spliced.


1102 


80 


561 


AAM25823 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1338. 


938 


97 


561 


gi17061813


Homo sapiens 


C21orf57 isoform B protein (C21orf57) 
mRNA, partial cds, alternatively 
spliced. 


804 


64 


562 


gi17061811


Homo sapiens 


C21orf57 isoform A protein (C21orf57)
mRNA, partial cds, alternatively 

spliced. 


818 


75 


562 


AAM25823 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1338.


687 


97 


562 


AAY48371 


Homo sapiens 


META- Human prostate cancer-
associated protein 68. 


674 


96 


563 


AAB93239 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12243. 


1630 


100 


563 


gi15928956


Homo sapiens 


clone MGC:22951 IMAGE:4872309,
mRNA, complete cds. 


1630 


100 


563 


gi14042582


Homo sapiens 


cDNA FLJ14798 fis, clone
NT2RP4001313, weakly similar to 
MITOCHONDRIAL IMPORT 
RECEPTOR SUBUNIT TOM40. 


1630 


100 


564 


AAB94479 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15153.


1521 


100 


564 


gi10434979


Homo sapiens 


cDNA FLJ13111 fis, clone 


1521 


100 
NT2RP3002566.






564 


gi14043295


Homo sapiens 


clone IMAGE:3534358, mRNA, 
partial cds. 


1448 


100 


565 


gi15620831


Homo sapiens 


mRNA for KIAA1886 protein, partial
cds. 


1420 


99 


565 


gi13276647


Homo sapiens 


mRNA; cDNA DKFZp761I2123 (from 
clone DKFZp761I2123); complete cds.


1420 


99 


565 


AAY86184 


Homo sapiens 


HELI- Nuclear transport protein clone 
hfb2007 protein sequence. 


1364 


99 


566 


gi4321787 


Mus musculus 


6-pyruvoyl-tetrahydropterin synthase


156 


42 


566 


gi12832727


Mus musculus 


putative 


156 


42 


566 


gi202561 


Rattus 

norvegicus 


6-pyruvoyl-tetrahydropterin synthase 


148 


41 


567 


gi13477179


Homo sapiens 


hypothetical protein FLJ10342, clone
MGC:12937 IMAGE:2820292, mRNA,
complete cds. 


1036 


100 


567 


gi12804363


Homo sapiens 


hypothetical protein FLJ10342, clone
MGC:4366 IMAGE:2822886, mRNA, 
complete cds. 


1036 


100 


567 


gi12653941


Homo sapiens 


hypothetical protein FLJ10342, clone
MGC:2740 IMAGE:2822886, mRNA, 
complete cds. 


1036 


100 


568 


gi9280047 


Macaca 
fascicularis 


unnamed protein product


596 


97 


568 


gi14532556


Arabidopsis 
thaliana 


AT5g57360/MSF19_2


91 


33 


568 


gi13487068


Arabidopsis 
thaliana 


Adagio 1 


91 


33 


569 


AAY87333 


Homo sapiens 


INCY- Human signal peptide 
containing protein HSPP-110 SEQ ID
NO:110.


543 


93 


569 


AAY12883 


Homo sapiens 


GEST Human 5' EST secreted protein
SEQ ID NO:473. 


226 


86 


569 


AAY12868 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID NO:458. 


168 


81 


570 


gi17389322


Homo sapiens 


Similar to NICE-5 protein, clone
MGC:21212 IMAGE:3907760, mRNA, 
complete cds. 


130 


65 


570 


AAY73387 


Homo sapiens 


INCY- HTRM clone 3340290 protein 
sequence. 


122 


75 


570 


AAG73684 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4448. 


76 


45 


571 


gi9280156 


Macaca 
fascicularis 


unnamed protein product


168 


82 


571 


AAO11992


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 25884. 


76 


50 


571 


AAO08245 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 22137. 


70 


43 


572 


gi12666208


Homo sapiens 


Human DNA sequence from clone 
RP11-103J18 on chromosome 13 
Contains ESTs, STSs, GSSs and a CpG 
island. Contains two novel genes and 
the 3' part of a novel gene similar to


490 


100 
mouse M025p complete sequence.






572 


AAU099o4 


Homo sapiens 


vfiiLiLt- MUman cyname acaminasc-iiKc 
protein from clone 26934. 




100 


572 


AAG04055 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8136. 


425 


100 


573 


gi12654927


Homo sapiens 


clone IVHjv^:j->uy 1MALjc:34jJdzj, 
mRNA, complete cds. 




100


573 


gi13905264


Mus musculus


Similar to hypothetical protein 


1034 


85 


573 


gi9022437 


Xenopus 
laevis 


ashwin 


*H J 


•t A 


574 


gi13477177


Homo sapiens 


Similar to RIKEN cDNA 1500032A17 
gene, clone MGC:12936
lMACjb:Z5ZiJUzz» mtuNA, con^>ieie 

cds. 


1128 


100 


574 


gi12851027


Mus musculus 


putative 


1012 


89 


574 


AAG04038 


Homo sapiens 


O-bo 1 Human secreted protem, ocv 




Q9 


575 


AAG93293 


Homo sapiens 


NISC- Human protein HP10659.


1343 


100 


575 


gi15929856


Homo sapiens 


oimilar to KIKJ^IN cuina uoiuui ijnzx 
gene, clone MGC:21397 
iiVLAvjcij6Dx*t4v;, mKJNA, compieic 
cds. 




100


575 


gil309714j 


Mus musculus 


KiKcri Ci/INA uoiuui iiNz^ gene 


1 1 j\j 


82 


576 


AAO07956 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 


74 . 


38 


576 


gi5917666 


Zea mays


extensin-like protein 


74 


40 


576 


gi3980411 


Arabidopsis 
thaliana


putative proline-rich protein






577 


AAG89212 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 332. 


Jit 




577 


gi49806lo 


Thermotoga 
maritima 


hypothetical protein


72 


36 


577 


gi9294037 


Arabidopsis 
thaliana 




67 


45 


578 


gi13324963


Caenorhabditis 
elegans 


Hypothetical protein F37B4.9


73 


41 


578 


gioo/yy^ / 


Mus musculus


sphingosine phosphate lyase 1


65 


30 


579 


gi12856429


Mus musculus 


putative 


869 


66 


579 


gi16549784


Homo sapiens 


cDNA FLJ30562 fis, clone

dKA W tlzUU4 / J 1 . 


763 


99 


579 


gi12848379


Mus musculus 


putative 


659 


62 


580 


AAY94520 


Homo sapiens 


iiNUi - numan lysmc-ncn siamenzi 
protein. 




Q6 


580 


gi438731 


Mesomys 
hispidus 


cytochrome b 


75 


39 


580 


gil478n2 


Sciurus aberti


cytochrome b 


73 


38 


581 


AAY73460 


Homo sapiens 


GEMY Human secreted protein clone 
ykl4 1 protein sequence SEQ ID
NO:142. 


416 


100 


582 


AAY07790 


Homo sapiens 


HUMA- Human secreted protein
fragment encoded from gene 47. 


294 


100 


582 


gi7107077


Porcine 


envelope glycoprotein 


63 


55 
reproductive
and respiratory 
syndrome 
virus 








582 


gi15231798


Arabidopsis 
thaliana 


putative protein 


63 


34 


583 


gi6331397


Homo sapiens 


mRNA for KIAA1287 protein, partial 

cds. 


6081 


99 


583 


gi12053113


Homo sapiens 


mRNA; cDNA DKFZp434H1220
(from clone DKFZp434H1220); 

complete cds. 


6081 


99 


583 


gi12850252


Mus musculus 


putative 


1511


93 


584 


gi13623583


Homo sapiens 


clone IMAGE:3939163, mRNA, 
partial cds. 


610 


99 


584 


AAG01516 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5597. 


522 


98 


584 


gi12654201


Homo sapiens 


clone IMAGE:3449838, mRNA, 
partial cds. 


458 


100 


585 


gi16519031


Homo sapiens 


putative tetracycline transporter-like 
protein mRNA, complete cds. 


535 


99 


585 


gi2506078 


Mus musculus 


tetracycline transporter-like protein 


535 


99 


585 


gi12836216


Mus musculus 


putative 


535 


99 


586 


gi16550027


Homo sapiens 


cDNA FLJ30760 fis, clone
FEBRA2000536, weakly similar to 
Homo sapiens paraneoplastic cancer- 
testis-brain antigen (MA5) mRNA.


2043 


100 


586 


gi14043275


Homo sapiens 


clone MGC:15827 IMAGE:3507248,
mRNA, complete cds. 


2043 


100 


586 


AAB12529


Homo sapiens 


SLOK Human Ma5 protein SEQ ID 
NO:13.


754 


46 


587 


gi9929997 


Macaca 
fascicularis 


hypothetical protein 


856 


93 


587 


AAB45027 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 3. 


76 


52 


587 


gi13359187


Homo sapiens 


mRNA for KIAA1657 protein, partial 
cds. 


73 


44 


588 


gi13559239


Homo sapiens 


Human DNA sequence from clone 
RP5-842G6 on chromosome 20. 
Contains the 3' end of a novel gene, the
3' end of the gene for a novel protein
similar to SEL1L (sel-1 (suppressor of
lin-12, C.elegans)-like), ESTs, STSs
and GSSs, complete sequence. 


815 


100 


588 


AAY36477 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene No. 23. 


/ 1^ 


t ^ 


588 


gi16769652


Drosophila 
melanogaster 


LD45826p 


618 


54 


589 


gi9971051 


Homo sapiens 


Human DNA sequence from clone
RP11-526K24 on chromosome 20. 
Contains a novel gene, the 5' end of a 
novel gene, two CpG islands, ESTs, 
GSSs and STSs, complete sequence.


585 


100 


589 


AAG01028 


Homo sapiens 


GEST Human secreted protein, SEQ ID 


579 


96 
NO: 5109.






589 


gi6782267 


Caenorhabditis 
elegans 


cDNA EST yk536g11.3 comes from
this gene-cDNA EST yk532d11.5
comes from this gene-cDNA EST
yk536g11.5 comes from this
gene-cDNA EST yk642c12.5 comes
from this gene 


222 


51


590 


ABB12373


Homo sapiens 


HYSE- Human bone marrow expressed 
protein SEQ ID NO: 128. 




oo 


590 


gi12698103


Macaca 
fascicularis 


hypothetical protein 


505 


96 


590 


AAG02711 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

NO: 6792. 


411


97 


591 


gi14336677


Homo sapiens 


16p13.3 sequence section 1 of 8.


673 


100


591 


gi14327922


Homo sapiens 


hypothetical protein FLJ22940, clone 
MGC:14880 IMAGE:3946937, mRNA,
complete cds.


673 


100 


591 


gi12655063


Homo sapiens 


polymerase (RNA) III (DNA directed)
polypeptide K (12.3 kDa), clone 
MGC:668 IMAGE:3051476, mRNA,
complete cds. 


673 


100


592 


gi9651111


Macaca 
fascicularis 


hypothetical protein 


495 


74 


592 


AAO06794 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 20686. 


110 


37 


592 


gi3882271 


Homo sapiens 


mRNA for KIAA0775 protein,
complete cds. 


101 


29 


593 


gi12848554


Mus musculus 


putative 




Oft 


593 


gi8655657 


Homo sapiens 


mRNA; cDNA DKFZp762O076 (from 
clone DKFZp762O076). 


1041 


100 


593 


gi12804029


Homo sapiens 


clone IMAGE:3940519, mRNA,
partial cds. 


754 


51 


594 


gi2190184 


Homo sapiens 


mRNA for zinc finger protein, 
complete cds. 


616 


100 


594 


gi12803507


Homo sapiens 


zinc finger protein, clone MGC:717 
IMAGE:3143091, mRNA, complete
cds. 


616 


100


594 


AAB58863 


Homo sapiens 


HUMA- Breast and ovarian cancer 
associated antigen protein sequence 
SEQ ID 571. 


599 


97 


595 


gi15080543


Homo sapiens 


Similar to RIKEN cDNA 5031425D22
gene, clone MGC:21579
IMAGE:4473003, mRNA, complete 

cds. 


1254 


100


595 


AAY35940 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 189. 


1051 


99 


595 


gi12860261


Mus musculus 


putative 


1007 


78 


596 


AAB43377 


Homo sapiens 


CURA- Human ORFX ORF3141
polypeptide sequence SEQ ID 
NO:6282. 


807 


99 


596 


gi16877603


Homo sapiens 


Similar to SNARE Vti1a-beta protein,
clone MGC:9292 IMAGE:3885564,
mRNA, complete cds. 


711 


100


596


gi3421062 


Mus musculus 


29-kDa Golgi SNARE 


700 


98 


597 


gil3384259 


Homo sapiens 


apolipoprotein L6 mRNA, complete
cds. 


1550 


99 


597 


AAM93925 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 4091. 


1341 


100 


597 


gi6562077 


Homo sapiens 


Human DNA sequence from clone 
SC22CB-33F2 on chromosome 22 
Contains part of the gene for a novel 
protein similar to C-terminal parts of 
APOL (apolipoprotein L) and TNF-
inducible protein CG12-1. Contains
GSSs, complete sequence. 


1251 


100 


598 


AAG01189 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 5270. 


301 


98 


598 


AAM40924 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

NO 5855. 


106 


41 


598 


ABB11379 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO: 1749. 


106 


41 


599 


gil2848031 


Mus musculus 


putative 


504 


76 


599 


gil27]8388 


Neurospora 
crassa 


conserved hypothetical protein 


186 


37 


599 


gi9758240 


Arabidopsis 
thaliana




141 


27 


600 


AAG04048 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8129. 


553 


100 


600 


AAM25836 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:1351.


501 


73 


600 


ABB15766 


Homo sapiens 


HUMA- Human nervous system related 
polypeptide SEQ ID NO 4423. 


365 


80 


601 


AAM25836 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:1351.


645 


77 


601 


AAG04048 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8129. 


553 


100 


601 


AAG02274


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6355. 


276 


96 


602 


gil7133695 


Nostoc sp. 
PCC7120 


WD-40 repeat-protein 


65 


45 


603 


gi7243278 


Homo sapiens 


mRNA for KIAA1440 protein, partial 
cds. 


2003 


100 


603 


gi7291723 


Drosophila
melanogaster 


CG3173 gene product 


1815 


34 


603 


gil3279125 


Homo sapiens 


clone IMAGE:3618123, mRNA, 
partial cds. 


1779 


100 


604 


AAY12244 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID NO: 557.


378 


87 


604 


AAY59717 


Homo sapiens 


GEST Secreted protein 58-49-3-G10-FL1.


378 


87 


604 


gi2291l29 


Caenorhabditis 
elegans 


Hypothetical protein C02A12.5 


78 


30 


605 


gil5074866 


Tuber 
magnatum 


protein kinase C homologue 


82 


32 


605 


gi7110512 


Gallus gallus 


TGF-beta signal transducer Smad5


79 


37 


605


AAM93694


Homo sapiens 


HELI- Human polypeptide, SEQ ID
NO: 3606.


75


63






606 


AAU16929 


Homo sapiens 


HUMA- Human novel secreted protein, 
SEQ ID 170. 


1118 


99 


606 


AAU17002 


Homo sapiens 


HUMA- Human novel secreted protein, 
SEQ ID 243. 


1117 


100 


606 


gil3623247 


Homo sapiens 


Similar to RIKEN cDNA 1110001K21
gene, clone MGC:11275
IMAGE:3944355, mRNA, complete 
cds. 


1082 


100 


607 


gii269S049 


Homo sapiens 


mRNA for KIAA1752 protein, partial 
cds. 


2706 


99 


607 


gi6103000 


Mus musculus 


fatso protein 


2384 


86 


607 


gil2855822 


Mus musculus 


putative 


463 


80 


608 


AAB93514 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 12846. 


312 


100 


608 


AAG01489 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5570. 


312 


100 


608 


AAW61552 


Homo sapiens 


ABBO Human endosulfine B protein. 


312 


100 


609 


gi 1 534 1686 


Homo sapiens 


clone MGC:20522 IMAGE:4578480, 
mRNA, complete cds. 


1695 


100 


609 


gil4349357 


Homo sapiens 


hypothetical protein FLJ22501, clone
MGC:14897 IMAGE:3939754, mRNA,
complete cds. 


1695 


100 


609 


gil04389l4 


Homo sapiens 


cDNA: FLJ22501 fis, clone
HRC11368. 


1695 


100 


610 


AAM93816 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3867. 


1051 


95 


610 


gi9280104 


Macaca 
fascicularis 


unnamed protein product 


1035 


48 


610 


AAE07112 


Homo sapiens 


HUMA- Human gene 6 encoded 
secreted protein fragment, SEQ ID 
NO: 129.


1033 


49 


611 


AAG93313 


Homo sapiens 


NISC- Human protein HP10569. 


365 


100 


611 


gil7389971 


Homo sapiens 


clone IMAGE:4251653, mRNA, 
partial cds. 


365 


100 


611 


AAG02098 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6179. 


300 


100 


612 


gi 12654899 


Homo sapiens 


Similar to x 006 protein, clone 
MGC:5294 IMAGE:3452502, mRNA, 
complete cds. 


1110 


100 


612 


AAB41932 


Homo sapiens 


CURA- Human ORFX ORF1696 
polypeptide sequence SEQ ID 
NO:3392. 


1091 


100 


612 


gi9437345 


Homo sapiens 


X 006 protein mRNA, complete cds. 


1022 


97 


613 


gill611571 


Macaca 
fascicularis 


hypothetical protein 


220 


89 


613 


gi9280196 


Macaca 
fascicularis 


unnamed protein product 


111 


34 


613 


gil2846582 


Mus musculus 


putative 


88 


28 


614 


AAG02925 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7006. 


275 


96 


614 


gi402I77 


Candida 
albicans 


Fatty acid synthase subunit beta 


65 


41 



139 



WO 02/074961



Table 2 



PCT/US02/05109 



SEQ ID NO: 


Accession No. 


Species 


Description 


Score 


% Identity


614 


gil592041 


Methanococcus
jannaschii


conserved hypothetical protein 


65 


31 


615 


gi 15787978 


Homo sapiens 


nuclear export factor 3 (NXF3) mRNA, 
complete cds.


2824 


100 


615 


gil 1230440 


Homo sapiens 


mRNA for nuclear RNA export factor 3 
(NXF3 gene). 


2824 


100 


615 


gil2053833 


Homo sapiens 


partial mRNA for nuclear RNA export
factor 3 (NXF3 gene). 


1794 


99 


616 


gi7770141


Homo sapiens 


PRO1728


662 


100 


616 


gil69156 


Pisum sativum 


ribulose 1,5-bisphosphate carboxylase 
small subunit propeptide 


73 


25 


616 


gil7862888 


Drosophila 
melanogaster 


SD01663p 


72 


31 


617 


AAY27630 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene No. 64. 


220 


100 


618 


gil 5487240 


Homo sapiens 


mRNA for putative autophagy-related 
cysteine endopeptidase 2 (AUTL2 

gene). 


2138 


99 


618


gi4 176500 


Homo sapiens 


Human DNA sequence from clone 
889N15 on chromosome Xq22.1-22.3.
Contains part of the gene for a novel 
protein similar to X. laevis Cortical 
Thymocyte Marker CTX, the possibly 
alternatively spliced gene for 26S 
Proteasome subunit p28 (Ankyrin 
repeat protein), a novel gene and exons 
36 through 45 of the COL4A6 for 
Collagen Alpha 6(IV). Contains ESTs,
STSs, GSSs and a putative CpG island, 
complete sequence. 


2123 


100 


618 


gil 5487242 


Homo sapiens 


mRNA for putative autophagy-related 
cysteine endopeptidase 2, short splice 
variant (AUTL2 gene).


1446 


73 


619 


gi2558947 


Bacillus 
subtilis 


ParC 


89 


23 


619 


gi2634193 


Bacillus 
subtilis 


DNA gyrase-like protein (subunit A) 


88 


23 


619 


gil405462 


Bacillus 
subtilis 


GrlA 


88 


23 


620 


gil2583981 


Homo sapiens 


transmembrane 6 superfamily member 
2 (TM6SF2) mRNA, partial cds.


1386 


90 


620 


gil 2583979 


Homo sapiens 


transmembrane 6 superfamily member 
1 (TM6SF1) mRNA, complete cds. 


830 


54 


620 


AAG89336 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 456. 


828 


54 


621 


gil7384428 


Homo sapiens 


Human DNA sequence from clone
RP11-100C15 on chromosome 9q34.2-
34.3 Contains the 3' end of a novel
gene for a protein similar to KIAA1543
protein, the gene for a novel potassium
channel subunit protein (KIAA1422),
part of a novel gene, the 5' end of a
gene for a novel lipocalin/cytosolic
fatty-acid binding protein and CpG
islands, complete sequence.


4928


100






621 


gil52l3360 


Homo sapiens 


clone IMAGE:3939659, mRNA,
partial cds. 


3270 


99 


621 


gil4714974 


Homo sapiens 


clone IMAGE:3565907, mRNA,
partial cds. 


1090 


100


622 


AAB66590 


Homo sapiens 


UYBR- Human KARP-1 protein.






622 


gi307094 


Homo sapiens 


Human Ku (p70/p80) subunit mRNA, 
complete cds. 


923 


92 


622 


gi307093 


Homo sapiens 


Human Ku autoimmune antigen gene, 
complete cds. 


923 


92 


623 


AAY73468 


Homo sapiens 


GEMY Human secreted protein clone 
ydSS 1 protein sequence SEQ ID 
NO: 158. 


601 


91 


623 


gi7292183 


Drosophila 
melanogaster 


CG12361 gene product 


75 


32 


623 


gi5911S22 


Homo sapiens 


Human DNA sequence from clone 
RP3-526I14 on chromosome 22 
Contains the BZRP gene for peripheral 
benzodiazapine receptor (PBR, PKBS,
mitochondrial benzodiazepine, MBR),
the KIAA0153 gene, and the gene for a
novel CUB and EGF-like domains
containing protein. Contains ESTs,
STSs, GSSs, genomic marker
D22S1179, a ca repeat polymorphism
and a putative CpG island, complete
sequence. 


74 


33 


624 


gil5788454 


Mus musculus


growth hormone-inducible soluble 
protein 


409 


92 


624 


gi7298358 


Drosophila 
melanogaster 


CG6115 gene product


215 


50 


624 


gi7529571 


Homo sapiens 


Human DNA sequence from clone 
RP1-12208 on chromosome 6q14.2-
16.1. Contains the 3' part of a novel
gene partially coded for by KIAA0301,
a novel gene and the 3' part of the gene
KIAA0957. Contains ESTs, STSs, 
GSSs and a putative CpG island, 
complete sequence. 


93 


34 


625 


gi9967224 


Macaca 
fascicularis 


hypothetical protein 


337 


yo 


625 


gi577220 


Saccharomyces
cerevisiae


Stt4p: Phosphatidylinositol-4-kinase 


68 


42 


625 


gi454207 


Saccharomyces
cerevisiae


homologous protein to PI3-kinase 
(STT4) 


68 


42 


626 


gi7291693 


Drosophila 
melanogaster 


CG16787 gene product


233 


36 


626 


gi4966353 


Arabidopsis 
thaliana


ESTs gbrn6348, gb|N65615 and 
gb|Z18119 come from this gene.


110 


26 


626 


gil7104753 


Arabidopsis 
thaliana 


unknown protein 


99 


26 


627 


gil2856787 


Mus musculus 


putative 


785 


98


627


AAG02618 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6699. 


jiy 


inn 


627 


gi4218005 


Arabidopsis 
thaliana 


putative vicilin storage protein 
(globulin-like)


lUl 




628 


gi 128345 88 


Mus musculus 


putative 






628 


gi7299316 


Drosophila 
melanogaster 


CG12816 gene product


99 


40 


628 


AAM83343 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NO: 10936. 


82 


34 


629 


AAB50865 


Homo sapiens 


UNIW Modified human annexin, SEQ 
ID NO: 6. 


565 


99 


629 


AAB50864 


Homo sapiens 


UNIW Modified human annexin, SEQ 
ID NO: 4. 


565 




629 


AAB50863 


Homo sapiens 


UNIW Modified human annexin, SEQ 
ID NO: 2. 


565 


99 


630 


AAB50865 


Homo sapiens 


UNIW Modified human annexin, SEQ
ID NO: 6. 


163 


96 


630 


AAB50864 


Homo sapiens 


UNIW Modified human annexin, SEQ 
ID NO: 4. 


163 


96 


630 


AAB50863 


Homo sapiens 


UNIW Modified human annexin, SEQ 
ID NO: 2. 


163 


96 


631 


AAE04909 


Homo sapiens 


INCY- Human transporter and ion 
channel-22 (TRICH-22) protein.


3324 


1 nn 


631 


AAB24281 


Homo sapiens 


UROG- Prostate tumour associated 
gene 24P4C12 protein sequence SEQ
ID NO:2.


3320 


99 


631 


AAB93981 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14063. 


3313 


QO 


632 


AAG81401 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:320. 


229 


100 


632 


AAG93300 


Homo sapiens 


NISC- Human protein HP10417.


229 


100 


632 


AAG00912 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4993. 


229 


100


633 


AAG89339 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 459. 


861 


mn 


633 


gil3397925 


Mus musculus 


hypothetical protein 


815 


94 


633 


gil2850449 


Mus musculus 


putative 


814 


94 


634 


AAB94808 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15947. 


708 


100 


634 


gil0436192 


Homo sapiens 


cDNA FLJ13912 fis, clone
Y79AA1000230. 


708 


100 


634 


gi 1 5680 180 


Homo sapiens 


clone MGC:229!39 IMAGE:4870865, 
mRNA, complete cds.


404 


91 


635 


gil4091315 


Mus musculus 


ADMP 


371 


85 


635 


gi 16877066 


Homo sapiens 


clone MGC:24447 IMAGE:4077762, 
mRNA, complete cds. 


173 


45 


635 


gil 6877059 


Homo sapiens 


clone MGC:24437 IMAGE:4075637, 
mRNA, complete cds. 


173 


45 


636 


gil0442725 


Homo sapiens 


pellino related intracellular signalling 
molecule (PRISM) mRNA, complete
cds. 


1111 


100 


636 


gil0242359 


Homo sapiens 


pellino 1 (PELI1) mRNA, complete
cds.


2273


100






636 


gil 6741380 


Mus musculus 


pellino (Drosophila) homolog 1


2268 


99 


637 


gi330178 


human 
herpesvirus 1


ORF1


77 


32 


637 


AAY17406 


Homo sapiens 


UYHU- Human atrophin-1 related 
protein. 


76 


35 


637 


gi8096340 


Homo sapiens 


mRNA for RERE, complete cds. 


76 


35 


638 


AAB42962 


Homo sapiens 


CURA- Human ORFX ORF2726 
polypeptide sequence SEQ ID 
NO:5452. 


1099 


100


638 


gi3342738 


Homo sapiens 


chromosome 19, cosmid R26660, 
complete sequence. 


358 


93 


638 


AAG03426 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

NO: 7507. 


315 


100 


639 


AAY00293 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 36. 


645 


86 


639 


AAM23891 


Homo sapiens 


HYSE- Human EST encoded protein 
SEQ ID NO: 1416.


394 


97 


639 


AAY12138 


Homo sapiens 


GEST Human 5' EST secreted protein
SEQ ID NO: 451. 


217 


100 


640 


gil 534 1790 


Homo sapiens 


Similar to RIKEN cDNA 2900009I07
gene, clone MGC:17347
IMAGE:2901027, mRNA, complete 
cds. 


1484 


100 


640 


gil2837626 


Mus musculus 


putative 


1414 


96 


640 


AAG74211 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4975. 


400 


64 


641 


gil4017855 


Homo sapiens 


mRNA for KIAA1819 protein, partial 
cds. 


2032 


99 


641 


gil4017849 


Homo sapiens 


mRNA for KIAA1816 protein, partial 
cds. 


253 


25 


641 


gi6979930 


Homo sapiens 


Maml mRNA, partial cds. 


195 


24 


642 


gil0439l51 


Homo sapiens 


cDNA: FLJ22671 fis, clone HSI08712.


1445 


100 


642 


AAE07108 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein fragment, SEQ ID 

NO:125. 


881 


98 


642 


AAE07053 


Homo sapiens 


HUMA- Human gene 3 encoded 
secreted protein HWHS013, SEQ ID 
NO:70. 


768 


99 


643 


AAB94047 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14209. 


1038 


100 


643 


gil4327927 


Homo sapiens 


hypothetical protein FLJ12474, clone
MGC:15036 IMAGE:3678268, mRNA,
complete cds. 


1038 


100 


643 


gil0433982 


Homo sapiens 


cDNA FLJ12474 fis, clone
NT2RM1000927. 


1038 


100 


644 


AAU00784 


Homo sapiens 


INCY- Human apoptosis protein,
APOP-4. 


1941 


100 


644 


gil3S44020 


Homo sapiens 


Similar to RIKEN cDNA 6030457N17
gene, clone MGC:13096
IMAGE:3944994, mRNA, complete
cds. 


1941 


100 


644 


gil2833947 


Mus musculus 


putative 


1382 


69


645


gi387048 


Cricetus 
cricetus 


DHFR-coamplified protein 


1037 


85 


645 


AAU19758 


Homo sapiens 


HUMA- Human novel extracellular 
matrix protein, Seq ID No 408.


538 


100 


645 


AAU2H95 


Homo sapiens 


HUMA- Human novel foetal antigen, 
SEQ ID NO 1739.


538 


100 


646 


gil6565963 


Homo sapiens 


SAM-dependent methyltransferase
gene, exon 11 and complete cds; and
SAM-dependent methyltransferase 
gene, complete cds, alternatively 
spliced. 


1076 


90 


646 


gil5342055 


Homo sapiens 


hypothetical protein MGC2454, clone 
MGC:4132 IMAGE:2961526, mRNA, 
complete cds. 


1076 


90 


646 


gil3278783 


Homo sapiens 


clone MGC:2454 IMAGE:2961526, 
mRNA, complete cds. 


1076 


90 


647 


AAG03651 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7732. 


199 


76 


647 


gi8927662 


Unknown 


Contains similarity to extensin (atExt1)
from Arabidopsis thaliana gb|U43627 
and is rich 


84 


39 


647 


gi7294152 


Drosophila 
melanogaster 


CG13048 gene product


83 


41 


648 


AAY12550 


Homo sapiens 


GEST Human 5' EST secreted protein
SEQ ID NO: 215 from WO 9906553. 


163 


100 


648 


gi9759124 


Arabidopsis
thaliana 


salt-inducible protein-like 


66 


37 


648 


gil5237345 


Arabidopsis
thaliana


salt-inducible protein-like 


66 


37 


649 


giI262852 


Mus musculus 


M17 protein


413 


55 


649 


gil3874586 


Macaca 
fascicularis 


hypothetical protein 


150 


34 


649 


gil5150696 


Caenorhabditis 
elegans 


Hypothetical protein Y55B1BR.3 


80 


32 


650 


gil2862482 


Homo sapiens 


ALS2CR3 mRNA for amyotrophic 
lateral sclerosis 2, candidate 3, 
complete cds. 


2969 


99 


650 


gil2862664 


Homo sapiens 


ALS2CR3 gene for amyotrophic lateral 
sclerosis 2, candidate 3, exon 16 and 

complete cds.


2963 


99 


650 


AAY92241 


Homo sapiens 


LUDW- Human cancer associated 
antigen precursor (MO-REN-4o). 


2962 


99 


651 


gil4043592 


Homo sapiens 


hypothetical protein FLJ13154, clone
MGC:13154 IMAGE:4302289, mRNA,
complete cds. 


1401 


100 


651 


gil3623389 


Homo sapiens 


hypothetical protein FLJ13154, clone
MGC:10683 IMAGE:4025993, mRNA, 
complete cds. 


1401 


100 


651 


gil3325194 


Homo sapiens 


hypothetical protein FLJ13154, clone 
MGC:11014 IMAGE:3641317, mRNA,
complete cds. 


1401 


100


652


gi 1284 1092 


Mus musculus 


putative 


1442 


90 


652 


AAB43804 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1249.


531 


85 


652 


gi466475 


Geobacillus
stearothermophilus


putative phospho-beta-glucosidase 


261 


33 


653 


gil6550394 


Homo sapiens 


cDNA FLJ31056 fis, clone
HSYRA2000760. 


1412 


99 


653 


gil 6648324 


Drosophila 
melanogaster 


LD29159p 


265 


42 


653


gi7295644 


Drosophila 
melanogaster 


CG14613 gene product


265 


42 


654 


AAY53056 


Homo sapiens 


GEMY Human secreted protein clone 
my340 1 protein sequence SEQ ID 
NO: 118.


479 


100 


655 


gi7293719 


Drosophila 
melanogaster 


CG14182 gene product


480 


51 


655 


gil6648454 


Drosophila 
melanogaster 


SD01285p 


79 


22 


655 


gi7291881 


Drosophila 
melanogaster 


CG3770 gene product 


79 


22 


656 


gil 5 146320 


Arabidopsis 
thaliana 


At2g27260/F12K2.16 


79 


34 


656 


gil3272403 


Arabidopsis 
thaliana 


unknown protein 


79 


34 


656 


gi3608135 


Arabidopsis 
thaliana


putative G-box-binding bZIP 
transcription factor 


74 


26 


657 


gil0439656 


Homo sapiens 


cDNA: FLJ23082 fis, clone
LNG06451. 


1960 


99 


657 


AAB95383 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17715.


1222 


100 


657 


gil0435167 


Homo sapiens 


cDNA FLJ13231 fis, clone
OVARC1000145. 


1222 


100 


658 


gil7046389 


Homo sapiens 


C21orf70 isoform B protein (C21orf70)
mRNA, complete cds, alternatively 
spliced. 


606 


100 


658 


gil7046387 


Homo sapiens 


C21orf70 isoform A protein (C21orf70) 
mRNA, complete cds, alternatively
spliced. 


606 


100 


658 


gil4424633 


Homo sapiens 


clone MGC:16722 IMAGE:4128732,
mRNA, complete cds. 


606 


100 


659 


AAO09511 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 23403. 


98 


38 


659 


AAO09309 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 23201. 


92 


56 


659 


gi220579 


Mus musculus 


open reading frame (196 AA)


88 


57 


660 


AAB94146 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14423. 


2585 


100 


660 


gil3325430 


Homo sapiens 


hypothetical protein FLJ12584, clone
MGC:11212 IMAGE:3929097, mRNA,
complete cds. 


2585 


100 


660 


gil0434160 


Homo sapiens 


cDNA FLJ12584 fis, clone
NT2RM4001187. 


2585 


100


661


AAY59708 


Homo sapiens 


GEST Secreted protein 76-20-4-C11-FL1.


196 


95 


661 


AAB43261 


Homo sapiens 


CURA- Human ORFX ORF3025
polypeptide sequence SEQ ID 

NO:6050. 


184 


97 


661 


gil5451283 


Macaca 
fascicularis 


hypothetical protein 


179 


97 


662 


gi 12834045 


Mus musculus 


putative 


309 


57 


662 


AAM79478 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3124. 


306 


52 


662 


AAM78494 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1156. 


306 


52 


663 


AAB87406 


Homo sapiens 


HUMA- Human gene 32 encoded
secreted protein HELHN47, SEQ ID 
NO:147. 


1862 


91 


663 


AAY86456 


Homo sapiens 


HUMA- Human gene 46-encoded 
protein fragment, SEQ ID NO:371.


1862 


91 


663 


AAY86260 


Homo sapiens 


HUMA- Human secreted protein 
HELHN47, SEQ ID NO: 175. 


1862 


91 


664 


AAW75222 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 27 clone H2MBT68. 


208 


100 


664 


gi3874864 


Caenorhabditis
elegans


C38C6.4 


70 


36 


664 


gi7497178 


Caenorhabditis 
elegans 


hypothetical protein C38C6.4 -
Caenorhabditis elegans


70 


36 


665 


gi9929941 


Macaca 
fascicularis 


hypothetical protein 


486 


89 


665 


AAM99916 


Homo sapiens 


HUMA- Human polypeptide SEQ ID

NO 32. 


70 


36 


665 


gi9929941 


Macaca 
fascicularis 


hypothetical protein 


486 


89 


666 


gil0438496 


Homo sapiens 


cDNA: FU22202 fis, clone 
HRC01333. 


915 


100 


666 


gi 1946267 


Oryza sativa 


myb 


80 


31 


666 


AAB64815 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 43 SEQ ID 
NO:101. 


79 


30 


667 


AAG03788 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7869. 


113 


34 


667 


AAM24321 


Homo sapiens 


HYSE- Human EST encoded protein 
SEQ ID NO: 1846. 


107 


56 


667 


AAY65066 


Homo sapiens 


GEST Human 5' EST related
polypeptide SEQ ID NO: 1227.


88 


50 


668 


gill6ll585 


Macaca 
fascicularis 


hypothetical protein


1 fyo 


on 
vu 


668 


gil2698180 


Macaca 
fascicularis 


hypothetical protein 


1789 


89 


668 


gil 3279047 


Homo sapiens 


clone MGC:10761 IMAGE:3606108,
mRNA, complete cds.


1446 


100 


669 


gi7417266 


Homo sapiens 


chromosome X map Xp11.23 L-type
calcium channel alpha-1 subunit
(CACNA1F) gene, complete cds;
HSP27 pseudogene, complete
sequence; and JM1 protein, JM2
protein, and Hb2E genes, complete cds.


4039


99






669 


gil 3559955 


Mus musculus 


DXImx48e protein


3034 


79 


669 


gil65693 


Oryctolagus 
cuniculus 


protein phosphatase regulatory subunit 


220 


28 


670 


AAB43283 


Homo sapiens 


CURA- Human ORFX ORF3047 
polypeptide sequence SEQ ID 
NO:6094. 


715 


100 


670 


gi 14250579 


Homo sapiens 


hypothetical protein PP1628, clone 
MGC:3072 IMAGE:3346334, mRNA,
complete cds.


715 


100 


670 


gil0441903 


Homo sapiens 


clone PP1628 unknown mRNA. 


715 


100 


671 


gil5082451 


Homo sapiens 


clone MGC:20253 IMAGE:4647654, 
mRNA, complete cds. 


1107 


98 


671 


AAB98620 


Homo sapiens 


SHAN- Human vacuolar H+-ATPase
C subunit 42. 


1105 


98 


671 


gil3277864 


Mus musculus 


Similar to ATPase, H+ transporting, 
lysosomal (vacuolar proton pump)
42kD 


1016 


90 


672 


AAB73533 


Homo sapiens 


INCY- Human transferase HTFS-4D, 
SEQ ID NO:40. 


150 


96 


672 


AAM40557 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 5488. 


150 


96 


672 


AAM38771 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 1916. 


150 


96 


673 


AAE 12563 


Homo sapiens 


ISIS- Human CITEDX (HCITEDX) 
protein. 


994 


100 


673 


gil4495276 


Homo sapiens 


MRG2 gene, complete cds. 


994 


100 


673 


gi5002200 


Mus musculus 


msg1-related protein 2


712 


77 


674 


gi4590448 


Leishmania 
braziliensis 


L6 ribosomal protein 


80 


34 


674 


AAY30681 


Homo sapiens 


GENO- Splice variant ZAP-1B protein
of the human tumor suppressor gene 
ZAP-1. 


71 


60 


674 


AAY30680 


Homo sapiens 


GENO- Splice variant ZAP-1A protein
of the human tumor suppressor gene 
ZAP-1. 


71 


60 


675 


gi995537 


Homo sapiens 


H.sapiens gp70 region of endogenous 
retrovirus erv-4. 


707 


100 


675 


gi995542 


Homo sapiens 


H.sapiens gp70 region of endogenous 
retrovirus erv-6. 


698 


99 


675 


gi995529 


Homo sapiens 


H.sapiens gp70 region of endogenous 
retrovirus erv-16.


690 


97 


676 


gil3816301 


Sulfolobus 
solfataricus 


Second ORF in transposon ISC1234


86 


45 


676 


gil381S862 


Sulfolobus 
solfataricus


Transposase ISC1234 


86 


45 


676 


gil707705 


Sulfolobus 
solfataricus 


orfc06026 


86 


45 


677 


gi6470334 


Homo sapiens 


protein translocase, JM26 protein,
UDP-galactose translocator, pim-2
protooncogene homolog pim-2h, and
shal-type potassium channel genes,
complete cds; JM12 protein and
transcription factor IGHM enhancer 3
genes, partial cds; and unknown gene,
complete sequence.


914


100






677 


gi3258629 


Homo sapiens 


inner mitochondrial membrane 
translocase Tim17b mRNA, nuclear
gene encoding mitochondrial protein, 
complete cds. 


914 


100 


677 


gi3 114824 


Homo sapiens 


mRNA for (JM3) preprotein 
translocase, complete CDS (clone
IMAGE 345224 and 
LLOXNC01U138D3 (Baylor 
College)). 


914 


100 


678 


gi6470334 


Homo sapiens 


protein translocase, JM26 protein, 
UDP-galactose translocator, pim-2 
protooncogene homolog pim-2h, and 
shal-type potassium channel genes, 
complete cds; JM12 protein and
transcription factor IGHM enhancer 3 
genes, partial cds; and unknown gene, 
complete sequence. 


852 


77 


678 


gi3258629 


Homo sapiens 


inner mitochondrial membrane 
translocase Tim17b mRNA, nuclear
gene encoding mitochondrial protein, 
complete cds. 


852 


77 


678 


gi3n4824 


Homo sapiens 


mRNA for (JM3) preprotein 
translocase, complete CDS (clone 
IMAGE 345224 and 
LLOXNC01U138D3 (Baylor 
College)). 


852 


77 


679 


AAB95758 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18678.


685 


100 


679 


gil4042475 


Homo sapiens 


cDNA FLJ14739 fis, clone
NT2RP3002402. 


685 


100 


679 


AAG02020 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6101. 


480 


98 


680 


AAY48565 


Homo sapiens 


META- Human breast tumour-
associated protein 26.


336 


96 


680 


gi9967248 


Macaca 
fascicularis 


hypothetical protein 


318 


88 


680 


gi3834384 


Homo sapiens 


nuclear localization signal containing 
protein deleted in Velo-Cardio-Facial 
syndrome (Nlvcf) mRNA, complete 
cds. 


66 


32 


681 


gil0437387 


Homo sapiens 


cDNA: FLJ21308 fis, clone
COL02131. 


2600 


99 


681 


AAG73603 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4367. 


2016 


100 


681 


gi6102903 


Homo sapiens 


mRNA; cDNA DKFZp566D244 (from 
clone DKFZp566D244); partial cds.


1492 


68 


682 


AAO09836 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 23728. 


265 


100 


682 


AAU39010 


Homo sapiens 


GEMY Human secreted protein
bG77 1.


265


100






682 


gil 695241 


Caenorhabditis 
elegans 


Hypothetical protein F20D6.8 


67 


43 


683 


AAG03386 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7467. 


343 


98 


683 


gil6504195 


Salmonella 
enterica subsp. 
enterica 
serovar Typhi 


hypothetical protein 


78 


28 


683 


gil2328S92 


Heterodoxus 
macropus


cytochrome b 


66 


37 


684 


gil4250495 


Homo sapiens 


Similar to RIKEN cDNA 0610006H10 
gene, clone MGC:9740 
IMAGE:3853707, mRNA, complete
cds. 


1677 


100 


684 


gil5489134 


Homo sapiens 


RIKEN cDNA 0610006H10 gene, 
clone MGC:17267 IMAGE:4155233,
mRNA, complete cds.


1159 


69 


684 


gil4789807 


Mus musculus 


RIKEN cDNA 0610006H10 gene 


1159 


69 


685 


AAG73989 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4753. 


717 


100 


685 


AAB58998 


Homo sapiens 


HUMA- Breast and ovarian cancer 

associated antigen protein sequence 
SEQ ID 706. 


717 


100 


685 


AAM89100 


Homo sapiens 


HUMA- Human 

immune/haematopoietic antigen SEQ 
ID NO: 16693. 


247 


61 


686 


AAY04295 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 3. 


478 


97 


686 


gi2 11447 


Gallus gallus 


receptor tyrosine kinase 


75 


35 


686 


gil749624 


Schizosaccharomyces
pombe


similar to Saccharomyces cerevisiae 
hypothetical 48.0KD protein in 
CDC28-ARL1 intergenic region 
precursor, SWISS-PROT Accession 
Number P38288 


69 


43 


687 


AAY02726 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 77 clone HE2EC79. 


158 


100


688 


gi9967194 


Macaca 
fascicularis 


hypothetical protein 


269 


94 


688 


gi9948233 


Pseudomonas 
aeruginosa


probable MFS transporter


69 


43 


688 


gil 5026548 


Clostridium
acetobutylicum


Predicted membrane protein 


68 


32 


689 


AAY02923 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 99. 


235 


100 


690 


AAG73811 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4575. 


1099 


96 


690 


gil 7028339 


Homo sapiens 


clone MGC:10198 IMAGE:3909581,
mRNA, complete cds. 


966 


99 


690 


gil 6740631 


Mus musculus 


Unknown (protein for MGC:27606) 


900 


90 


691 


AAG02438 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6519. 


360 


100


692


gil65539l4 


Homo sapiens 


cDNA FLJ25202 fis, clone REC05350.


2486 


87 


692 


gil3445910 


Homo sapiens 


radial spoke protein 3 (RSP3) mRNA, 
complete cds. 


1771 


86 


692 


gil6553419 


Homo sapiens 


cDNA FLJ33093 fis, clone
TRACH2000675, weakly similar to
RADIAL SPOKE PROTEIN 3. 


1566 


88 


693 


gil6553914 


Homo sapiens 


cDNA FLJ25202 fis, clone REC05350.


2921 


99 


693 


gil3445910 


Homo sapiens 


radial spoke protein 3 (RSP3) mRNA, 
complete cds. 


2144 


100 


693 


gil3874516 


Macaca 
fascicularis 


hypothetical protein 


1799 


94 


694 


AAY13135 


Homo sapiens 


GEST Human secreted protein encoded 
by 5' EST SEQ ID NO: 149. 


355 


100 


694 


gil6420959 


Salmonella
typhimurium
LT2


regulator for XapA (LysR family) 


74 


35 


694 


gil6503639 


Salmonella 
enterica subsp. 
enterica 
serovar Typhi 


xanthosine operon transcriptional
regulator 


74 


35 


695 


AAG00152 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4233. 


198 


100 


695 


gil4022310 


Mesorhizobium
loti


hypothetical protein 


66 


46 


696 


gi4959568 


Homo sapiens 


nuclear pore complex interacting
protein NPIP (NPIP) mRNA, complete 
cds. 


1742 


99 


696 


gi2342743 


Homo sapiens 


Human Chromosome 16 BAC clone
CIT987SK-A-589H1, complete
sequence. 


1724 


98 


696 


AAY10915 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted peptide. 


865 


98 


697 


gi4959568 


Homo sapiens 


nuclear pore complex interacting 
protein NPIP (NPIP) mRNA, complete 
cds. 


1583 


87 


697 


gi2342743 


Homo sapiens 


Human Chromosome 16 BAC clone
CIT987SK-A-589H1, complete
sequence. 


1565 


87 


697 


gi3337385 


Homo sapiens 


Chromosome 16 BAC clone
CIT987SK-A-761H5, complete
sequence. 


886 


63 


698 


gi4959568 


Homo sapiens 


nuclear pore complex interacting 
protein NPIP (NPIP) mRNA, complete
cds. 


1586 


92 


698 


gi2342743 


Homo sapiens 


Human Chromosome 16 BAC clone
CIT987SK-A-589H1, complete
sequence. 


1573 


91 


698 


AAY10915 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted peptide. 


865 


98 


699 


gi4959568 


Homo sapiens 


nuclear pore complex interacting 
protein NPIP (NPIP) mRNA, complete 
cds. 


1503 


88 


699 


gi2342743 


Homo sapiens 


Human Chromosome 16 BAC clone
CIT987SK-A-589H1, complete
sequence.


1485


87






699 


gi3337385 


Homo sapiens 


Chromosome 16 BAC clone
CIT987SK-A-761H5, complete 
sequence. 


871 
6 / L 


Do 


700 


gi17389867


Homo sapiens 


Similar to protein phosphatase 1, 
regulatory (inhibitor) subunit 1A, clone

MGC:24vWl IMAGE:^AOoyiy, mRNA,

complete cds. 


572 


100 


700 


gi10198117


Mus musculus 


protein phosphatase inhibitor-1


226 


49 


700 


gi7271433 


Rattus 
norvegicus 


protein phosphatase inhibitor-1


223 


48 


701 


gi1710282


Homo sapiens 


Human clone 23803 mRNA, partial
cds. 


1899 


100 


701 


gi15215400


Homo sapiens 


hypothetical protein MGC4675, clone 
MGC:2450 IMAGE:2961135, mRNA,
complete cds. 


458 


37 


701 


gi13278936


Homo sapiens 


Similar to RIKEN cDNA
5430432M24 gene, clone MGC:4675
IMAGE:3332odO, mRNA, complete

cds. 


4 JO 


n 


702 


AAB93771 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 13481. 


1 1 i\l 
llVf 


inn 


702 


gi10432902


Homo sapiens 


cDNA rUl loUo fis, clone
HEMBA1003976. 




inn 


702 


gi6599138 


Homo sapiens 


mRNA; cDNA DKFZp434I036 (from 
clone DKFZp434I036); partial cds.


86 


23 


703 


AAW89046 


Homo sapiens 


HUMA- Polypeptide fragment encoded
by gene 182. 


196 


100 


703 


gi2313995


Helicobacter 
pylori 26695 


lipid A disaccharide synthetase (lpxB)


74 


30 


703 


gi4155351


Helicobacter 
pylori J99 


LIPID-A-DISACCHARIDE 
SYNTHASE 


68 


37 


704 


gi15930206


Homo sapiens 


hypothetical protein FLJ12806, clone
MGC:95lo IMAGE:39U357i/, mRNA,
complete cds. 


1583 


99 


704 


AAB94314 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14787. 


1576 


99 


704 


gi10434510


Homo sapiens 


cDNA FLJ12806 fis, clone
NT2RP2002235. 


1576 


99 


705 


AAY64818 


Homo sapiens 


GEST Human 5' EST related 
polypeptide SEQ ID NO:979. 


429 


97 


705 


gi3913990 


Mycobacterium
smegmatis


ATP-DEPENDENT PROTEASE LA > 


66 


37 


705 


gi122240


Rattus 
norvegicus 


RT1 CLASS II

HISTOCOMPATIBILITY ANTIGEN, 
A BETA CHAIN > 


66 


28 


706 


AAB95004


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 16665. 


664 


99 


706 


gi10433328


Homo sapiens 


cDNA FLJ11952 fis, clone
HEMBB1000831, weakly similar to
Homo sapiens breast cancer nuclear
receptor-binding auxiliary protein
(BRX) mRNA.


664


99






706 


gi10803146


Streptomyces 
coelicolor 


putative regulatory protein 


88 


42 


707 


AAG74480 


Homo sapiens 


HUMA- Human colon cancer antigen
protein SEQ ID NO:5244. 


2371 


99 


707 


AAB53417 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein sequence SEQ ID NO:957. 


2371 


99 


707 


gi15489153


Homo sapiens 


hypothetical protein FLJ11896, clone
MGC:16887 IMAGE:3858181, mRNA,
complete cds. 


1729 


100 


708 


gi12862476


Homo sapiens 


SIMPLE mRNA for small integral 
membrane protein of lysosome/late 
endosome, complete cds. 


903 


99 


708 


gi17391332


Mus musculus 


LPS-induced TNF-alpha factor 


813 


86 


708 


gi6739573 


Mus musculus 


TBX1 protein


813 


86 


709 


AAG03860 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7941. 


425 


72 


709 


gi337508 


Homo sapiens 


Human ribosomal protein S25 mRNA, 
complete cds. 


425 


72 


709 


gi13436422


Homo sapiens 


ribosomal protein S25, clone 
MGC:4211 IMAGE:2905996, mRNA,
complete cds. 


425 


72 


710 


AAB63957 


Homo sapiens 


LUDW- Human prostate cancer 
associated antigen protein sequence 
SEQ ID NO:1319.


696 


100 


710 


gi15082563


Homo sapiens 


clone MGC:20481 IMAGE:4644158, 
mRNA, complete cds. 


696 


100 


710 


gi12804525


Homo sapiens 


clone IMAGE:2823236, mRNA, 
partial cds. 


696 


100 


711 


gi13929452


Homo sapiens 


Human DNA sequence from clone 
RP3-337O18 on chromosome 20q12-
13.1. Contains the PLPT gene encoding
Phospholipid Transfer Protein, the
PPGB gene coding for Lysosomal
Protective Protein precursor (EC
3.4.16.5, Cathepsin A,
Carboxypeptidase C) and the gene
encoding peroxisomal acyl-CoA
thioesterase (PTE1, thioesterase II),
four novel genes, the gene for a novel
protein similar to Drosophila
Neuralized (Neu) and the 5' end of an
isoform of the TNNC2 gene for fast
troponin C2. Contains three CpG
islands, ESTs, STSs and GSSs,
complete sequence. 


3655 


100 


711 


gi16552100


Homo sapiens 


cDNA FLJ32079 fis, clone 
OCBBF2000013. 


3645 


99 


711 


AAM70804 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 31110. 


936 


100 


712 


AAY60350 


Homo sapiens 


META- Human normal bladder tissue
EST encoded protein 22. 


247 


90 
712


gi138S3230


Mycobacterium
tuberculosis
CDC1551


hydrolase, Ama/HipO/HyuC family 


65 


44 


712 


gi2894215


Mycobacterium
tuberculosis
H37Rv


amiB 


65 


44 


713


AAY94970 


Homo sapiens 


GEMY Human secreted protein clone 
dm365 3 protein sequence SEQ ID 
NO: 146. 


523 


100 


713 


gi15161741


Agrobacterium 
tumefaciens 
str. C58
(Cereon) 


AGR_pAT_14p


70 


41 


713 


gi17743430


Agrobacterium 
tumefaciens 
str. C58 
(Dupont) 


conserved hypothetical protein 


70 


41 


714 


AAG93310 


Homo sapiens 


NISC- Human protein HP10561.


1124 


97 


714 


gi12858071


Mus musculus 


putative 


819 


73 


714 


gi12751094


Homo sapiens 


PNAS-124 mRNA, complete cds. 


667 


99 


715 


AAM78541 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1203, 


788 


86 


715 


gi15080755


Homo sapiens 


ribonuclease P subunit (RPP21)
mRNA, complete cds. 


788 


86 


715 


gi10439106


Homo sapiens 


cDNA: FLJ22638 fis, clone HSI06727.


788 


86 


716 


gi12849817


Mus musculus 


putative 


679 


83 


716 


AAY57925 


Homo sapiens 


INCY- Human transmembrane protein 
HTMPN-49. 


670 


100 


716 


gi4926831 


Arabidopsis 
thaliana 


T17H7.16 


111 


30 


717 


gi9885192 


Homo sapiens 


Human DNA sequence from clone 
RP5-881L22 on chromosome 20
Contains ESTs, GSSs, STSs and CpG 
islands. Contains a gene for a novel 
protein similar to a trypsin inhibitor and 
four other genes for novel proteins, 
complete sequence. 


1939 


100 


717 


gi14017764


Mus musculus 


CG10671-like


348 


35 


717 


gi14017773


Mus musculus 


Cg10671-like


348 


35 


718 


gi7959173 


Homo sapiens 


mRNA for KIAA1456 protein, partial 
cds. 


1942 


99 


718 


gi16741666


Homo sapiens 


clone MGC:16945 IMAGE:3867327,
mRNA, complete cds. 


1942 


99 


no 

718 


gi/j(;i4ij 


Drosophila
melanogaster 


Cuo9oo gene product




<o 
jy 


719 


AAG75423 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:6187. 


994 


98 


719 


AAB53454 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein sequence SEQ ID NO:994. 


994 


98 


719 


gi12839939


Mus musculus 


putative 


801 


92 


720 


gi14582152


Xenopus
laevis 


maxi-K potassium channel alpha
subunit Slo 


151 


100 


720 


gi5577974


Trachemys 


calcium-activated potassium channel 


151 


100 



153 



WO 02/074961



PCT/US02/05109 



Table 2 



SEQ ID NO: 


Accession No. 


Species 


Description


Score 


%
Identity






scripta 


isoform thc7 






720 


gi2072759 


Gallus gallus


calcium-activated potassium channel 


1 CI 




721 


AAG03177 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7258. 


244 


100 


722 


AAG78876 


Homo sapiens 


SHAN- Human zinc finger protein 36. 


1749 


100 


722 


gi12804829


Homo sapiens 


clone MGC:4707 IMAGE:353454U
mRNA, complete cds. 


1749 


100 


722 


gi10438507


Homo sapiens 


cDNA: FLJ22210 fis, clone
HRC015G3. 


1744 


99 


723 


AAO03397 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 17289. 


364 


89 


723 


gi10697002


Homo sapiens 


Human DNA sequence from clone 
RP11-408E5 on chromosome 13q11-
12.2 Contains an FSH primary response
homolog 1 (FSHPRH1) pseudogene,
two genes for novel proteins, a gene for
an orthologue of mouse tubulin alpha 3
(TUBA3) or 7 (TUBA7) and a gene for
a novel protein similar to DMPK-like
CDC42-binding protein kinase beta 
(CDC42BPB). Contains ESTs, STSs 
and GSSs, complete sequence. 


330 


84 


723 


AAB42069 


Homo sapiens 


CURA- Human ORFX ORF1833 
polypeptide sequence SEQ ID 
NO:3666. 


282 


75 


724 


gi1045612


Human 

endogenous 

retrovirus 


pol polyprotein 


242 


71 


724 


AAO03158 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 17050. 


138 


41 


724 


AAM41750 


Homo sapiens


HYSE- Human polypeptide SEQ ID 
NO 6681. 


134 


37 


725 


AAW88411 


Homo sapiens 


UYMA- Acute myeloid leukaemia 
nuclear matrix associated protein 
AML-1B.


103 


100 


725 


gi966999 


Homo sapiens 


Human AML1 mRNA for AML1c
protein (alternatively spliced product),
complete cds.


103 


100 


725 


gi3153104


Homo sapiens 


959 kb contig between AML1 and
CBR1 on chromosome 21q22, segment
3/3. 


103 


100 


726 


gi10437131


Homo sapiens 


cDNA: FLJ21106 fis, clone
CAS05176. 


1268 


99 


726 


gi7294550 


Drosophila 
melanogaster 


CG10982 gene product 


294 


40 


726 


gi3875258 


Caenorhabditis 
elegans 


weak similarity with bacillus
amyloliquefaciens permease IIBC
(Swiss Prot accession number
P41029)-cDNA EST yk573h3.3 comes
from this gene-cDNA EST yk573h3.5
comes from this gene-cDNA EST
EMBL:AU109975 comes from this
gene-cDNA EST EMBL:AU110906
comes from this gene-cDNA EST
EMBL:AU112278 comes from this
gene-cDNA EST EMBL:AU110642
comes from this gene-cDNA EST
EMBL:AU114810 comes from this
gene-cDNA EST EMBL:AU114566
comes from this gene-cDNA EST
EMBL:AU116117 comes from this
gene-cDNA EST EMBL:AU113930
comes from this gene


201


46






727 


AAB97828 


Homo sapiens 


PFIZ Human G protein-coupled 
receptor PFI-014 protein sequence SEQ 
ID NO:2.


195 


54 


727 


AAE06763 


Homo sapiens 


INCY- Human G-protein coupled 
receptor-13 (GCREC-13) protein. 


174 


45 


727 


gi13384175


Homo sapiens 


FKSG46 


166 


44 


728 


AAG02577 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6658. 


263 


98 


728 


ABB12137 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2507. 


261 


100 


728 


gi14334860


Arabidopsis 
thaliana 


putative ATP-dependent Clp protease 
regulatory subunit CLPX 


78 


39 


729 


AAG03340 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7421.


230 


97 


729 


gi15075752


Sinorhizobium 
meliloti 


PROBABLE ADENYLOSUCCINATE 
SYNTHETASE IMP-ASPARTATE 
LIGASE PROTEIN 


64 


34 


730 


AAG02081 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6162. 


565 


99 


730 


AAB65702 


Homo sapiens 


SUGE- Novel protein kinase, SEQ ID 
NO: 231. 


80 


26 


730 


gi15289906


Oryza sativa 


hypothetical protein 


72 


29 


731 


gi16549183


Homo sapiens 


cDNA FLJ30046 fis, clone
3NB692001719. 


1593 


100 


731 


ABB11357 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:1727. 


1470 


93 


731 


AAG00669 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 4750. 


600 


100 


732 


gi11611585


Macaca 
fascicularis


hypothetical protein


2151 


90 


732 


gi12698180


Macaca 
fascicularis 


hypothetical protein 


2142 


90 


732 


gi13279047


Homo sapiens 


clone MGC:10761 IMAGE:3606108, 
mRNA, complete cds. 


1446 


100 


733 


AAG03184 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7265. 


254 


84 


734 


AAB36365 


Homo sapiens 


ASAH Human TRAF6 binding protein
(T6BP) SEQ ID NO:1.


2317 


99 


734 


gi13435951


Mus musculus 


Similar to TAK1-binding protein 2;
KIAA0733 protein


610 


32 


734 


AAG64616 


Homo sapiens 


MATS/ Human TAB2 amino acid 
sequence. 


600 


32 


735 


gi9988100 


Homo sapiens 


Human DNA sequence from clone
RP3-467N11 on chromosome 6q16.1-
16.3 Contains part of a gene for a novel
protein. Contains GSSs, STSs, ESTs
and a CpG island, complete sequence.


562


100






735 


gi1322280


Mus musculus 


unconventional myosin VI 


78 


24 


735 


gi12321496


Arabidopsis 
thaliana 


hypothetical protein 


75 


25 


736 


gi10437991


Homo sapiens 


cDNA: FLJ21816 fis, clone HEP01116.


2205 


100 


736 


gi3253105 


Caenorhabditis 
elegans 


Hypothetical protein B0041.7 


88 


22 


736 


gi5901659 


Caenorhabditis 
elegans 


XNP-1 


88 


22 


737 


AAB36671 


Homo sapiens


TAKE Human secretory protein TGC- 
715 SEQ ID NO:n.


406 


100 


737 


AAU12423 


Homo sapiens 


GETH Human PRO1273 polypeptide
sequence. 


406 


100 


737 


AAM94192 


Homo sapiens 


HUMA- Human reproductive system 
related antigen SEQ ID NO: 2850. 


406 


100 


738 


gi12856120


Mus musculus 


putative 


781 


91 


738 


gi7292255 


Drosophila 
melanogaster 


CG16984 gene product


229 


33 


738 


gi161290


Loligo pealei 


kinesin heavy chain 


101 


31 


739 


gi10439252


Homo sapiens 


cDNA: FLJ22746 fis, clone
HUV01174. 


1284 


99 


739 


gi16549966


Homo sapiens 


cDNA FLJ30707 fis, clone 
FCBBF200121L 


562 


41 


739 


gi13376148


Homo sapiens 


hypothetical protein FLJ22746 


1284 


99 


740 


AAY86331 


Homo sapiens 


HUMA- Human secreted protein 
HLDCE79, SEQ ID NO:246.


179 


100 


741 


AAB70489 


Homo sapiens 


SREN- Human hHAIERbs-iso protein 
sequence SEQ ID NO:7.


1116 


91 


741 


AAM25809 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1324. 


1116 


91 


741 


ABB11989 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2359.


1116 


91 


742 


AAB70489 


Homo sapiens 


SREN- Human hHAIERbs-iso protein 
sequence SEQ ID NO:7.


835 


73 


742 


AAM25809 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1324. 


835 


73 


742 


ABB11989


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2359.


835 


73 


743 


AAG03428 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7509. 


389 


98 


743 


AAY59723 


Homo sapiens 


GEST Secreted protein o0-14-2-H10-
FL1.






743 


gi12852865


Mus musculus 


putative 


295 


41 


744 


AAB95034 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 16786.


807 


100 


744 


gi10433444


Homo sapiens 


cDNA FLJ12057 fis, clone
HEMBB1002068. 


807 


100 


744 


gi14715075


Mus musculus 


mitotic arrest deficient 1-like 1 


85 


27 


745 


AAY13128 


Homo sapiens 


GEST Human secreted protein encoded 
by 5' EST SEQ ID NO: 142.


632 


100 
745


gi12844331


Mus musculus


putative 


509 


91 


745 


AAM25781 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1296.


411 


48 


746 


AAB43357 


Homo sapiens 


CURA- Human ORFX ORF3121 
polypeptide sequence SEQ ID 
NO:6242. 


652 


54 


746 


gi12851679


Mus musculus 


putative 


640 


52 


746 


AAM38640 


Homo sapiens 


HUMA- Human colorectal cancer 
antigen SEQ ID NO: 155. 


615 


62 


747 


gi16552467


Homo sapiens 


cDNA FLJ32372 fis, clone 
SALGL1000005. 


1067 


100 


747 


gi15278389


Homo sapiens 


Similar to hypothetical protein, 
MGC:7036, clone MGC:4797 
IMAGE:3544761, mRNA, complete
cds. 


1067 


100 


747 


gi13097090


Mus musculus 


Unknown (protein for MGC:7036) 


750 


73 


748 


AAB64418 


Homo sapiens 


INCY- Amino acid sequence of human
intracellular signalling molecule 
INTRA50. 


248 


100 


748 


AAM43637 


Homo sapiens 


HUMA- Human polypeptide SEQ ID
NO 315. 


248 


100 


748 


AAM43562 


Homo sapiens 


HUMA- Human polypeptide SEQ ID 
NO 240. 


248 


100 


749 


gi17512087


Homo sapiens 


clone IMAGE:4544931, mRNA,
partial cds. 


733 


100 


749 


gi15488867


Mus musculus 


RIKEN cDNA 2210010N10 gene


596 


77 


749 


gi13905220


Mus musculus 


Similar to RIKEN cDNA 2210010N10 

gene 


591 


77 


750 


gi16553708


Homo sapiens 


cDNA FLJ25045 fis, clone CBL03591. 


580 


76 


750 


AAB65273 


Homo sapiens 


GETH Human PR01287 (UNQ656) 
protein sequence SEQ ID NO:381.


152 


31 


750 


AAB87561


Homo sapiens 


GETH Human PRO1287.


152 


31 


751 


AAE02443 


Homo sapiens 


CHIL- Human beta-glucuronidase 
(GUS), 


290 


77 


751 


AAW93828 


Homo sapiens 


CAMB- Human GUS protein fragment. 


290 


77 


751 


AAR50092


Homo sapiens 


BEHW Humanised anti-CEA sFv 
fragment-human beta-glucuronidase 
fusion protein.


290 


77 


752 


AAY54593 


Homo sapiens 


INCY- Amino acid sequence of a 
human transferase designated 
HUTRAN-3. 


2334 


100 


752 


AAB43316 


Homo sapiens 


CURA- Human ORFX ORF3080
polypeptide sequence SEQ ID 
NO:6160. 


2334 


100 


752 


gi5257221


Mus musculus 


protein arginine methyltransferase 


2289 


98 


753 


AAB43316 


Homo sapiens


CURA- Human ORFX ORF3080 
polypeptide sequence SEQ ID 
NO:6160. 


2400 


100 


753 


gi5257221 


Mus musculus 


protein arginine methyltransferase 


2355 


98 


753 


AAY54593 


Homo sapiens 


INCY- Amino acid sequence of a 
human transferase designated 
HUTRAN-3. 


2334 


100 


754 


AAG00395


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 4476.


268


100






754 


gi14574333


Caenorhabditis 
elegans 


Hypothetical protein Y41D4B.21 


66 


30 


755 


AAY10869 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 


129 


68 


755 


gi1170402


Perameles 
gunnii 


SPERM PROTAMINE P1 >


63 


32 


756 


AAB95812 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 18806. 


1914 


100 


756 


gi12652907


Homo sapiens 


clone MGC:2603 IMAGE:3350471, 
mRNA, complete cds. 


1914 


100 


756 


gi10436683


Homo sapiens 


cDNA FLJ14264 fis, clone 
PLACE1002004. 


1914 


100 


757 


gi11493710


Homo sapiens 


p10-binding protein BITE (BITE)
mRNA, complete cds. 


3022 


99 


757 


AAB95280 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17491. 


3014 


99 


757 


gi10434862


Homo sapiens 


cDNA FLJ13036 fis, clone
NT2RP3001253, weakly similar to
NUF1 PROTEIN.


3014 


99 


758 


AAB41848 


Homo sapiens 


CURA- Human ORFX ORF1612 
polypeptide sequence SEQ ID 
NO:3224. 


559 


93 


758 


gi12861339


Mus musculus 


putative 


443 


74 


758 


AAY36414 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 7. 


442 


93 


759 


gi4128039


Homo sapiens 


mRNA for TL132. 


994 


99 


759 


AAM38692 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 1837. 


887 


95 


759 


AAM38691


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 1836. 


887 


95 


760 


gi12320889


Arabidopsis 
thaliana 


ATP-dependent DNA helicase RecQ,
putative 


69 


35 


760 


gi17426897


Arabidopsis 
thaliana 


helicase 


69 


35 


760 


AAM92379 


Homo sapiens 


HUMA- Human digestive system 
antigen SEQ ID NO: 1728. 


68 


43 


761 


AAB31473 


Homo sapiens 


ZYMO Amino acid sequence of a 
human helical cytokine designated 
Zalpha33. 


924 


100 


761 


AAG93271 


Homo sapiens 


NISC- Human protein HP10431. 


924 


100 


761 


gi14198326


Homo sapiens 


Similar to RIKEN cDNA 1810038N03 
gene, clone MGC:9890 
IMAGE:3868437, mRNA, complete 
cds. 


924 


100 


762 


gi9790624 


Homo sapiens 


testis-specific kinase substrate (TSKS) 
gene, complete cds.


3062 


100 


762 


gi11068125


Mus musculus 


testis specific serine kinase substrate 


2084 


81 


762 


AAM95529 


Homo sapiens 


HUMA- Human reproductive system 
related antigen SEQ ID NO: 4187.


785 


85 


763 


gi6502963 


Mus musculus 


KX antigen 


944 


43 


763 


gi12841470


Mus musculus 


putative 


944 


43 


763 


gi4883433 


Homo sapiens 


mRNA for membrane transport protein
(XK gene).


930


44






764 


AAB95836 


Homo sapiens 


HELI- Human protein sequence SEQ 

ID NU.Ioooj. 


6026 


99 


764 


gi10436735


Homo sapiens 


cDNA FLJ14303 fis, clone
PLACE2000132. 


6026 


99 


764 


gi14971110


Homo sapiens 


mucin 16 (MUC16) mRNA, partial cds.


6023 


no 

99 


765 


gi6S07698 


Homo sapiens 


mRNA; cDNA DKFZp434A1014 
(from clone DKFZp434A1014); partial
cds.


308 


48 


765 


AAM77697 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 


2/6 


/4 


765 


AAM64969 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID 

IMU: 3 /U/4. 


278 


74 


766 


AAB95310 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17554.


551 


100 


766 


gi14794914


Mus musculus 


capicua protein 


1 At 
101 


3z 


766 


gi12836037


Mus musculus 


putative 


1 A 1 
101 




767 


gi4309887 


Homo sapiens 


PAC clone RP5-1163J12 from 7q21.2-
q31.1, complete sequence.


1047 0 


99 


767 


AAM73703 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 34009.


136 


100 


767 


AAM61008 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID
NO: 33113. 


136 


100 


768 


gi2664295 


Homo sapiens 


H.sapiens MDR3 gene, exonl, exon2. 


141 


100 


768 


gi307181 


Homo sapiens 


Human membrane glycoprotein P 
(mdr3) mRNA, complete cds.


136 


100 


768 


gi1006663


Homo sapiens 


H.sapiens mRNA for MDR3 P- 
glycoprotein. 


136 


100 


769 


gi12854186


Mus musculus 


putative 


1703 


88 


769 


gi5596697 


Homo sapiens 


Novel human gene mapping to 
chromosome 22.


818 


49 


769 


gi4493522 


Homo sapiens 


Human DNA sequence from clone 
RP3-323M22 on chromosome 22 
Contains the 5' part of the PACSIN2
(protein kinase C and casein kinase 
substrate in neurons 2) gene and a 
novel gene coding for a protein similar 
to KIAA0173 and worm tubulin 
tyrosine ligase, genomic marker 
jLizzo4 1 o, L^A repeat, i o, a i os, 
GSSs and putative CpG islands, 
complete sequence. 


818 


49 


770 


AAB94472 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:15137.


1284 


100 


770 


gi10434955


Homo sapiens 


cDNA FLJ13096 fis, clone
NT2RP3002166. 


1284 


100 


770 


AAM66773 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 27079. 


258 


100 
771


AAB93902


Homo sapiens


HELI- Human protein sequence SEQ
ID NO:13857.


892 


95 


771 




Homo sapiens


CLIINA FLJ 1 Z IH / LIS, ClUuC 


892 


95 


771 


AAG03840 


Homo sapiens 


GEST Human secreted protein, SEQ ID 


89 


56 


772 


gi13325313


Homo sapiens 


Similar to RIKEN cDNA 1500005N04 
gene. Clone ivi\j\^. ii/j^ j 
IMAGE:3936182, mRNA, complete 
cds. 


678 


100 


772 


AAG74090


Homo sapiens 


HUMA- Human colon cancer antigen
protein SEQ ID NO:4854.


500 


97 


772 


gll2c3713D 


Mus musculus 


putative 


487 


75 


773 


AAB68986 


Homo sapiens 


UYJO Human polyamine-modulated
factor-1 PMF-1.


832 


98 


773 


gi5737759 


Homo sapiens 


polyamine modulated factor-1 (PMF1)
mRNA, complete cds.


832 


98 


773 


gi5737757 


Homo sapiens 


polyamine modulated factor-1 (PMF1)
gene, exons z inrougn d ana conipicce 
cds. 


832 


98 


774 


gi10440444


Homo sapiens 


mDKTA A^r 171 TAAA^fi nr/\t-«iirt narfial 

mKiNA lor rLJUuujo proiem, pamai 
cds. 




100 


774 


gi882260 


Homo sapiens 


Human chromatin assembly factor-I 
p60 subunit mRNA, complete cds. 




28 


774 


gi7768767 


Homo sapiens 


genomic DNA, chromosome 21q, 
seciion 02r/iu^* 


86 


28 


775 


gi10437174


Homo sapiens 


cDNA: FLJ21135 fis, clone
CAS07262. 


1236 


99 


775 


AAO01368


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 15260. 


1 a/O 


46 


775 


gi10645308


Leishmania 
major 


L8453.1 


101 


27 


776 


AAB87431 


Homo sapiens 


HUMA- Human gene 14 encoded 
secreted protein fragment, SEQ ID 
NO: 172. 


883 


100 


776 


AAB87398 


Homo sapiens 


HUMA- Human gene 14 encoded 
secreted protein HTEAM34, SEQ ID
NO: 139. 




inn 


776 


AAB87350 


Homo sapiens 


HUMA- Human gene 14 encoded
secreted protein HTEAM34, SEQ ID 
iNu.yo. 




100 

1 vU 


777 


gi13374939


Homo sapiens 


Human DNA sequence from clone 
vjri i-zu**nxA on cni ui 1 losonic z.u. 
Contains part of a novel gene, ESTs, 
STSs and GSSs, complete sequence. 


371 


100 


777 


gi12843034


Mus musculus 


putative 


362 


85 


777


AAG02702 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6783. 


278 


98 


778 


AAG02713 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6794. 


299 


100 


778 


gi6563166 


Quiscalus 
lugubris


NADH dehydrogenase subunit 2 


68 


38 


778 


AAW57056 


Homo sapiens 


CHIL- Class II trans activator (CIITA)
polypeptide.


66


44






779 


AAY13108 


Homo sapiens 


GEST Human secreted protein encoded
by 5' EST SEQ ID NO: 122. 


237 


100 


780 


gi14124974


Homo sapiens 


Similar to CG12113 gene product,
clone IMAGE:3532726, mRNA, partial
cds. 


*rv*tO 




780 


gi14602672


Homo sapiens 


Similar to CG12113 gene product,
clone IMAGE:j5r*ojjy, mRNA, partial
cds. 


2702 


100 


780 


gi14603034


Homo sapiens 


clone MGC:16733 IMAGE:4129693,
mRNA, complete cds. 


2557 


100 


781 


gi17223622


Homo sapiens 


ATP-binding cassette A6 mRNA, 
complete cds. 


721 


100 


781 


AAY57954 


Homo sapiens 


INCY- Human transmembrane protein 
HTMPN-78. 


541 


100 


781 


AAM25936 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1451. 


484 


100 


782 


gi8979818 


Homo sapiens 


Human DNA sequence from clone 
RP3-447E21 on chromosome 6p12.1-
21.1 Contains the 5' end of gene similar
to bovine chloride channel protein
(p64), a fragment similar to X.laevis
Arei/ protein, a fragment similar to
Myelin-associated oligodendrocytic
basic protein (MOBP-81), a novel
pseudogene, a CpG island, ESTs, STSs
and GSSs, complete sequence. 


954 


100 


782 


gi14031047


Homo sapiens 


^i^i^jD mKJMA, cumpicic COS. 




100 


782 


gi4588530 


Bos taurus


chloride channel protein p64 


398 


46 


783 


AAY72161 


Homo sapiens 


dAUU/ Human RNA metabolism
protein (RMEP-1).




100 


783 


gi4680653 


Homo sapiens 


CGI-07 protein mRNA, complete cds.




100 


783 


gi15426434


Homo sapiens 


CGI-07 protein, clone MGC:13335 
IMAGE:4291797, mRNA, complete 
cds.


829 


100 


784 


gi7298468


Drosophila 
melanogaster 


CG15164 gene product


413 


35 


784 


gi14026730


Mesorhizobium
loti


homoserine kinase 




zo 


784 


gi15075719


Sinorhizobium 
meliloti


PUTATIVE AMINOTRANSFERASE
PROTEIN


300 


27 


785 


AAM65753 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ
ID NO: 26059. 


661 


100 


785 


AAM53375 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 25480. 


661 


100 


785 


gi13879308


Mus musculus


centromere autoantigen B 


368 


30 


786 


AAY11439 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID No 261. 


163 


100 


787 


gi9967303 


Macaca 
fascicularis 


hypothetical protein 


297 


96 


787 


AAM55988 


Homo sapiens 


MOLE- Human brain expressed single
exon probe encoded protein SEQ ID
NO: 28093.


184


100






787 


gi7379384 


Neisseria 

meningitidis 

Z2491 


putative pilus assembly protein


68 


36 


788 


gi15080333


Homo sapiens 


clone MGC:20510 IMAGE:4542472, 
mRNA, complete cds. 


1380 


100 


788 


AAB41490 


Homo sapiens 


CURA- Human ORFX ORF1254 
polypeptide sequence SEQ ID 
NO:2508. 


1267 


81 


788 


gi12698051


Homo sapiens 


mRNA for KIAA1753 protein, partial 

cds. 


1227 


73 


789 


AAY00277 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 20. 


165 


100 


789 


AAB084S0 


Homo sapiens 


COMP- A human kallikrein-2 (KLK-2)
splice variant polypeptide. 


75 


30 


789 


gi14574289


Caenorhabditis 
elegans 


Hypothetical protein Y37EUC.1 


72 


58 


790 


gi16550493


Homo sapiens 


cDNA FLJ31139 fis, clone
IMR322001185. 


1281 


99 


790 


gi3876588 


Caenorhabditis 
elegans 


predicted using Genefinder-cDNA
EST yk185a11.3 comes from this
gene-cDNA EST yk185a11.5 comes
from this gene-cDNA EST yk223d12.5
comes from this gene-cDNA EST
yk266b2.5 comes from this
gene-cDNA EST yk460f10.5 comes
from this gene-cDNA EST yk643b12.3
comes from this gene-cDNA EST
yk504b3.5 comes from this
gene-cDNA EST yk627c11.5 comes
from this gene-cDNA EST yk643b12.5
comes from this gene-cDNA EST
yk681b10.3 comes from this gene


239 


33 


790 


gi3880607 


Caenorhabditis 
elegans 


cDNA EST yk443f7.5 comes from this 
gene 


109 


37 


791 


gi9837427 


Lytechinus 
variegatus 


embryonic blastocoelar extracellular 
matrix protein precursor 


271 


44 


791 


gi17135842


Nostoc sp. 
PCC 7120


ORF_ID:alr73a4-similar to hlyA


121 


31 


791 


gi4566524 


Rattus 
norvegicus 


Na+/Ca2+-exchanging protein
precursor 


120 


32 


792 


gi14250766


Homo sapiens 


hypothetical protein FLJ21959, clone 
MGC:14921 IMAGE:4100186, mRNA,
complete cds. 


2119 


100 


792 


gi10438183


Homo sapiens 


cDNA: FLJ21959 fis, clone HEP05511.


2119 


100 


792 


AAY36034 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 419. 


1659 


97 


793 


AAB82316 


Homo sapiens 


UYCO Human immunoglobulin 
receptor IRTA3 protein. 


491 


100 


793 


gi16033594


Homo sapiens 


SH2 domain-containing phosphatase 
anchor protein 2c mRNA, complete 
cds, alternatively spliced. 


491 


100 
793


gi16033591


Homo sapiens 


SH2 domain-containing phosphatase 
anchor protein 2b mRNA, complete
cds, alternatively spliced.


491 


100


794


AAG66841


Homo sapiens 


SHAN- Human dihydroorotase 40. 


710 


99 


794 


gil2052764 


Homo sapiens 


mRNA; cDNA DKFZp564O0523
(from clone DKFZp564O0523);
complete cds. 


703 


98 


794 


ABB12204


Homo sapiens 


HYSE- Human HSPC304 homologue, 
SEQ ID NO:2574.


698 


98 


795 


AAU12298 


Homo sapiens 


GETH Human PRO9820 polypeptide 
sequence. 


874 


98 


795 


AAH23959_aa 
1 


Homo sapiens 


KYOW Human Klotho cDNA, SEQ ID
NO:5.


460 


52 


795 


AAB73618 


Homo sapiens 


KYOW Human Klotho protein encoded 
by SEQ ID NO:5.


460 


52 


796 


AAU12298 


Homo sapiens 


GETH Human PRO9820 polypeptide 
sequence. 


169 


100 


796 


AAB29903 


Homo sapiens 


HUMA- Human secreted protein 
BLAST search protein SEQ ID NO: 
161. 


83 


40 


796 


gil777770 


Cavia 
porcellus 


cytosolic beta-glucosidase 


83 


40 


797 


AAY13002 


Homo sapiens 


GEST Human secreted protein encoded 
by 5' EST SEQ ID NO: 16.


222 


100 


798 


AAB65161 


Homo sapiens 


GETH Human PRO203 (UNQ177) 
protein sequence SEQ ID NO:30. 


1901 


100 


798 


AAY66638 


Homo sapiens 


GETH Membrane-bound protein 
PRO203. 


1901 


100 


798 


AAB19407


Homo sapiens 


CHIR Amino acid sequence of a human 
secreted protein. 


1896 


99 


799 


gil6306705 


Homo sapiens 


clone MGC:3298 IMAGE:3508400. 
mRNA, complete cds. 


962 


100 


799 


AAY58614 


Homo sapiens 


INCY- Protein regulating gene 
expression PRGE-7. 


571 


69 


799 


AAM42020 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 6951. 


87 


28 


800 


AAY48414 


Homo sapiens 


META- Human prostate cancer- 
associated protein 111. 


191 


100 


800 


gi7293155 


Drosophila 
melanogaster 


CG8916 gene product 


68 


27 


801 


AAG02085 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6166. 


271 


100 


801 


gil6421767 


Salmonella
typhimurium
LT2


DNA biosynthesis; DNA primase 


67 


34 


801 


gil6504287 


Salmonella 
enterica subsp. 
enterica 
serovar Typhi 


DNA primase


67 


34 


802 


AAB93911 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:13877.


335 


97 


802 


AAM91037 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO: 18630.


335


97


802 


AAG01519 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5600. 


335 


97 


803 


gi10438284


Homo sapiens 


cDNA: FLJ22032 fis, clone HEP08743.


1485 


99 


803 


gil4017927 


Homo sapiens 


mRNA for KIAA1855 protein, partial
cds. 


1214 


93 


803 


gi4589614 


Homo sapiens 


mRNA for KIAA0985 protein,
complete cds. 


140 


31 


804 


AAG00145 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4226. 


225 


95 


804 


AAY07867 


Homo sapiens 


HUMA- Human secreted protein
fragment encoded from gene 16.


225 


95 


804 


AAW71684 


Homo sapiens 


INCY- Amino acid sequence of the 
human tumourigenesis associated 
protein. 


225 


95 


805 


AAB41200 


Homo sapiens 


CURA- Human ORFX ORF964 
polypeptide sequence SEQ ID 
NO: 1928. 


694 


99 


805 


gil2855307 


Mus musculus 


putative 


377 


91 


805 


AAG02108


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 6189.


333 


57 


806 


AAG02252 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6333, 


330 


98 


806 


gi6730714 


Arabidopsis
thaliana 


Unknown protein 


68 


38 


806 


gi5729893 


Homo sapiens


A kinase (PRKA) anchor protein 6; A- 
kinase anchor protein 100 


63 


47 


807 


AAB93899 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:13848.


3873 


99 


807 


gil4042001 


Homo sapiens 


cDNA FLJ14464 fis, clone
MAMMA1000309.


3873 


99 


807 


gil7512096 


Homo sapiens 


Similar to hypothetical protein 
FLJ14464, clone IMAGE:4554168,
mRNA, partial cds. 


2081 


100 


808 


gil 2654201 


Homo sapiens 


clone IMAGE:3449838, mRNA. 
partial cds. 


621 


100 


808 


gil 7068388 


Homo sapiens 


Similar to hypothetical protein 
FLJ14775, clone MGC:24018
IMAGE:4105917, mRNA, complete 
cds. 


609 


99 


808 


AAG01516 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5597. 


446 


98 


809 


AAB58340 


Homo sapiens 


ROSE/ Lung cancer associated 
polypeptide sequence SEQ ID 678. 


942 


100 


809 


ABB11637


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2007. 


600 


100 


809 


gil 6878257 


Homo sapiens 


clone MGC:29726 IMAGE:4547604. 
mRNA, complete cds. 


477 


52 


810 


ABB11722


Homo sapiens 


HYSE- Human V segment homologue, 
SEQ ID NO:2092. 


382 


59 


810 


giU99646 


Homo sapiens 


Human T cell receptor beta chain
(TCRB) mRNA, VDJ region, partial
cds.


330


57


810 


gi 186400/ 


Callithrix
jacchus 




328 


56 


811 


gil0437049 


Homo sapiens 


cDNA: FLJ21047 fis, clone


797 


98 


811 


gi 13880570 


Mycobacteriu 
m tuberculosis 
CDC1551 


conserved hypothetical protein 


79 


35 


811 


gi3261634 


Mycobacteriu 
m tuberculosis 

H37Rv


nypomeuwai piuicin ixvi/7 / 


79 


35 


812 


Ril2838791 


Mus musculus 


putative 


566 


76 


812 


AAG01260 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5341. 


338 


65 


812 


gi7297946 


Drosophila 
melanogaster 


Cuj433 gene proauci 


0^ 




813 


AAB43507 


Homo sapiens


HUMA- Human cancer associated 
protein sequence SEQ ID NO:952.


1 178 




813 


gi4205084 


Homo sapiens 


Human WW domain binding protein-1 
mRNA, complete cds. 


1378 


98 


813 


gil4603081 


Homo sapiens 


Similar to WW domain binding protein 
1, clone MGC: 15305 

TX4 A * AOO 70 mP XT A rnmnt pf P 

wfuKSjlCf^'tjyjy/, /y, uuuNA, con^ieie 
cds. 


117R 




814 


gll 302004*1^ 


Homo sapiens 


mRNA for hypothetical protein and
STS SHGC-2390. 


1854 


100 


814 


gil0439232 


Homo sapiens 


cDNA: FLJ22729 fis, clone HSI15685.


793 


100 


814 


gil4290514 


Homo sapiens 


hypothetical protein FLJ22729, clone
MGC:16790 IMAGE:4184795, mRNA,
complete cds. 


7R0 


QQ 


815 


AAY41454 


Homo sapiens 


HUMA- Fragment of human secreted 
protein encoded by gene 30. 


111 


93 


815 


gi3758843 


Plasmodium 
falciparum 


hypothetical protein, PFC0820w 


71 


26 


815 


gil5025672 


Clostridium
acetobutylicum


Carbamoylphosphate synthase large 
subunit 


67 


33 


816 


AAG03514 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7595. 


189 


91 


816 


gil90508 


Homo sapiens 


Human PRB4 locus salivary proline- 
rich protein mRNA, complete cds. 


80 


30 


816 


gil5196112 


human, 

peripheral 

blood 

leukocytes, 

subject 

Genomic 

Mutant, 753 

nt]. [Homo 

sapiens 


PRB4 (PRB4M PO-)=T>arotid V 
protein {exon 3} 


80 


30 


817 


AAY65007 


Homo sapiens 


GEST Human 5' EST related
polypeptide SEQ ID NO: 1168.


300 


100 


817 


AAG03529 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 7610.


300


100


817 


gil932727 


Homo sapiens 


Human armadillo repeat protein 
mRNA, complete cds. 


64 


59 


818


AAG01406 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5487. 


397 


100 


818 


gil2804657 


Homo sapiens 


clone IMAGE:3354845, mRNA,
partial cds. 


397 


100 


818 


gil2841742 


Mus musculus 


putative 


320 


76 


819 


gil 0440259 


Homo sapiens 


cDNA: FLJ23537 fis, clone
LNG07690. 


1045 


100 


819 


gi48491 


Vibrio
parahaemolyticus


tryptophan synthase; alpha subunit 


74 


35 


819 


gil5155988 


Agrobacterium 
tumefaciens 
str. C58 
(Cereon) 


AGR_C_l792p 


72 


23 


820 


gil0439767 


Homo sapiens 


cDNA: FLJ23168 fis, clone
LNG09905. 


1679 


99 


820 


gi31932S0 


Caenorhabditis 
elegans 


Hypothetical protein ZK1055.1


122 


23 


820 


gil5290033 


Oryza sativa


putative myosin heavy chain-like 
protein 


121 


23 


821 


AAB95117 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17106. 


1515 


100 


821 


gil0434031 


Homo sapiens 


cDNA FLJ12505 fis, clone
NT2RM2001699. 


1515 


100 


821


gi6056365 


Homo sapiens 


chromosome 14 clone 99E15 
containing gene for KIAA1036,
complete CDS, complete sequence.


857 


57 


822 


ABB44606 


Homo sapiens 


SWIT- Human wound healing related 
polypeptide SEQ ID NO 89. 


989 


100 


822 


ABB44607 


Homo sapiens 


SWIT- Human wound healing related 
polypeptide SEQ ID NO 90. 


876 


91 


822 


ABB44596 


Homo sapiens 


SWIT- Human wound healing related 
polypeptide SEQ ID NO 55. 


747 


100 


823 


AAB94920 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 16368. 


760 


100 


823 


gil043281S 


Homo sapiens 


cDNA FLJ11539 fis, clone
HEMBA1002748. 


760 


100 


823 


gil 1071808 


Leishmania 
major 


hypothetical protein P2 14.45 


96 


31 


824 


AAE03641 


Homo sapiens 


INCY- Human extracellular matrix and 
cell adhesion molecule-5 (XMAD-5),


1599 


100 


824 


gil 5559374 


Homo sapiens 


clone IMAGE:3628973, mRNA, 
partial cds. 


1599 


100 


824 


AAW54090 


Homo sapiens 


TEXA Homo sapiens BE 123 sequence. 


1340 


99 


825 


AAB85771 


Homo sapiens 


INCY- Human drug metabolizing
enzyme (ID No. 3861612CD1). 


1587 


100 


825 


gil 6877032 


Homo sapiens 


clone MGC:24011 IMAGE:4091916, 
mRNA, complete cds. 


1573 


98 


825 


AAB73512 


Homo sapiens 


INCY- Human transferase HTFS-19,
SEQ ID NO: 19.


773 


50 
826


AAY08477 


Homo sapiens 


ABBO Human BS274 protein epitope 
3. 


181 


100 


826 


AAY08476 


Homo sapiens 


ABBO Human BS274 protein epitope 
2. 


102 


100 


826 


AAY08478 


Homo sapiens 


ABBO Human BS274 protein epitope 
4. 


97 


100 


827 


gi2231329 


Ovis aries 


bactenecin 11


89 


37 


827 


gi3044086 


Myxococcus 
xanthus 


unknown 


89 


35 


827 


AAY41496 


Homo sapiens 


HUMA- Fragment of human secreted
protein encoded by gene 70.


88 


37 


828 


gin093911 


Homo sapiens 


Bcl-2 related proline-rich protein 
(BCL2L12) gene, complete cds, 
alternatively spliced. 


1158 


100 


828 


gil4043469 


Homo sapiens 


Similar to RIKEN cDNA 
5430429M05 gene, clone MGC:13155 
IMAGE:4302950, mRNA, complete
cds.


1150 


99 


828 


AAW38358 


Homo sapiens 


APOP- Apoptosis associated protein 
Bbk. 


1141 


99 


829 


gil054887 


Homo sapiens 


Human HMGI-C chimeric transcript 
mRNA, partial cds. 


239 


68 


829 


AAG02793 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6874. 


197 


77 


829 


AAG74844 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:5608. 


146 


59 


830 


gii5341178 


Homo sapiens 


lymphocyte alpha-kinase (LAK)
mRNA, complete cds. 


472 


100 


830 


AAB56768 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1346. 


465 


98 


830 


gil2S58085 


Mus musculus 


putative 


412 


85 


831 


gi]0436233 


Homo sapiens 


cDNA FLJ13936 fis, clone
Y79AA1000802. 


2754 


100 


831 


AAB95616 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:18326.


2747 


100 


831 


AAO05842 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19734. 


687 


97 


832 


gil 5426492 


Homo sapiens 


hypothetical protein FLJ21657, clone 
MGC:14939 IMAGE:3621124, mRNA,
complete cds.


1029 


93 


832 


gil0437800 


Homo sapiens 


cDNA: FLJ21657 fis, clone
COL08663. 


1025 


93 


832 


gi7292406 


Drosophila 
melanogaster 


CG10866 gene product


263 


35 


833 


AAY66151 


Homo sapiens 


META- Human bladder tumour EST 
encoded protein 9. 


412 


98 


833 


gi6690682 


Rhodobacter 
sphaeroides 


Orn73 


84 


36 


833 


gil4023427 


Mesorhizobium
loti


maltose-binding protein component of 
ABC sugar transporter 


78 


35 


834 


AAM25486 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:1001.


765 


100 


834 


AAV43605_aa
1


Homo sapiens


CHIR Human secreted protein 5 1
encoding DNA.


352


39


834 


AAY03241 


Homo sapiens 


SAGA Clone HP10484 of a human 
secretory signal protein (2). 


352 


39 


835 


AAB36599 


Homo sapiens 


INCY- Human FLEXHT-21 protein
sequence SEQ ID NO:21. 


1332 


100 


835 


gi4929699 


Homo sapiens 


CGM 15 protein mRNA, complete cds.


1332 


100 


835 


gil2846260 


Mus musculus 


putative 


1018 


74 


836 


AAY10855 


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 


185 


100 


837 


gi975846 


Bos taurus


immunoglobulin lambda light chain 
variable region 


74 


33 


837 


gi3411264 


Emericella 
nidulans 


homeodomain DNA-binding 
transcription factor 


70 


58 


837 


gi7299135 


Drosophila 
melanogaster 


Mst85C gene product 


69 


33 


838 


gi9948733 


Pseudomonas 
aeruginosa 


conserved hypothetical protein 


75 


40 


838 


AAB34864 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 11 SEQ ID
NO:68. 


71 


38 


838 


gi6S62167 


Homo sapiens 


mRNA; cDNA DKFZp564M1916
(from clone DKFZp564M1916); partial
cds.


71 


33 


839 


gil0437026 


Homo sapiens 


cDNA: FLJ21031 fis, clone
CAE07336. 


663 


98 


839 


gi7188828 


Gibberella 
circinata


histone H3 


75 


39 


839 


gi5106126 


Aeropyrum 
pernix


172aa long hypothetical protein 


75 


40 


840 


gil0439719 


Homo sapiens 


cDNA: FLJ23132 fis, clone
LNG08559. 


2269 


100 


840 


gil4017917 


Homo sapiens 


mRNA for KIAA1850 protein, partial 
cds. 


2256 


99 


840 


gil 3365945 


Macaca 
fascicularis 


hypothetical protein 


2093 


93 


841 


AAY21589 


Homo sapiens 


GEMY Human secreted protein (clone 
BV278-2). 


470 


100 


841 


AAW52984 


Homo sapiens 


GEMY Homo sapiens clone BV278_2 
protein. 


420 


100 


841 


AAG03462 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7543. 


383 


98 


842 


gil6588712 


Homo sapiens 


P33 mRNA, complete cds. 


1284 


94 


842 


gil4334374 


Homo sapiens 


leucine zipper protein AF5alpha 
mRNA, complete cds.


1284 


94 


842 


gil4250169 


Homo sapiens 


Similar to leucine zipper protein 
FKSG14, clone MGC:14847 
IMAGE:3511065, mRNA, complete 
cds. 


1284 


94 


843 


AAB95308 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:17550.


1366 


99 


843 


gil 0434984 


Homo sapiens 


cDNA FLJ13114 fis, clone
NT2RP3002603. 


1366 


99 


843 


AAB40721 


Homo sapiens 


CURA- Human ORFX ORF485
polypeptide sequence SEQ ID NO:970.


1286


98


844 


gil 2839493 


Mus musculus 


putative 


714 


68 


844 


AAG01527 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5608. 


666 


98 


844 


gi2950243 


Hordeum 
vulgare 


extensin 


77 


31 


845 


gil 0798772 


Homo sapiens 


mRNA for p53AIP1 gamma, complete
cds. 


579 


100


845 


gil0798770 


Homo sapiens 


mRNA for p53AIP1beta, complete cds.


257 


100 


845 


gil0798768 


Homo sapiens 


mRNA for p53AIP1, complete cds.


257 


100 


846 


AAB73675 


Homo sapiens 


INCY- Human oxidoreductase protein
ORP-8. 


620 


100 


846 


gil2841928 


Mus musculus 


putative 


536 


84 


846 


gil5421813 


Salmonella 
enteritidis 


putative protein 


350 


54 


847 


AAB95773 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18713. 


1180 


83 


847 


gil0436616 


Homo sapiens 


cDNA FLJ14213 fis, clone
NT2RP3003572. 


1180 


83 


847 


gil4286252 


Homo sapiens 


Similar to hypothetical protein 
FLJ14213, clone MGC:16218
IMAGE:3659247, mRNA, complete
cds. 


681 


100 


848 


gil 655261 6 


Homo sapiens 


cDNA FLJ32480 fis, clone
SKNMC2001057. 


2291 


99 


848 


gil 3278954 


Homo sapiens


clone IMAGE:3543931, mRNA, 
partial cds. 


1246 


100 


848 


AAB94905 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 16300. 


1155 


99 


849 


AAB48789 


Homo sapiens 


HOSP- Human prostate cancer- 
predisposing protein, CA7 CG04. 


73 


42 


849 


AAM40386 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 3531. 


73 


42 


849 


gil0862762 


Homo sapiens 


Human DNA sequence from clone 
RP4-595C2 on chromosome 1q24.1-
25.3 Contains ESTs, STSs and GSSs.
Contains the 3' part of the gene for two
isoforms of the KIAA0351 protein and
the gene for angiopoietin Y1, complete
sequence. 


73 


42 


850 


gil2248877 


Oryctolagus
cuniculus 


mitsugumin72/junctophilin type1


2009 


92 


850 


gi9927301 


Mus musculus 


junctophilin type 1 


1971 


91 


850 


gi9886738 


Homo sapiens 


JP3 mRNA for junctophilin type3, 
complete cds. 


14 /J 


0/ 


851 


gil0334802 


Homo sapiens 


fanconi anemia protein E (FANCE) 
mRNA, complete cds. 


2735 


100 


851 


gil2850619 


Mus musculus 


putative 


339 


50 


851 


gi5929884 


Rattus 
norvegicus 


nucleolin-related protein NRP 


103 


24 


852 


AAY59931 


Homo sapiens 


META- Human myometrium tumour 
EST encoded protein 11.


398 


98 


852 


AAY59934


Homo sapiens


META- Human myometrium tumour
EST encoded protein 14.


215


70


852 


AAY59933 


Homo sapiens 


META- Human myometrium tumour
EST encoded protein 13. 


193 


94 


853 


AAY66147 


Homo sapiens 


META- Human bladder tumour EST
encoded protein 5. 


365 


98 


853 


gi12833738


Mus musculus 


putative 


71 


52 


853 


gi3293036 


Pseudomonas 
putida 


xcpY 


64 


24 


854 


gil0437476 


Homo sapiens 


cDNA: FLJ21386 fis, clone
COL03414. 


1645 


100 


854 


gil7028379 


Homo sapiens 


Similar to hypothetical protein 
FLJ22792, clone MGC:22933
IMAGE:4905554, mRNA, complete 
cds. 


1537 


98 


854 


gi791119 


Saccharomyce 
s cerevisiae 


unknown 


81 


26 


855 


AAG62621 


Homo sapiens 


BIOR- Human SNARE protein 25, 


1101 


100 


855 


gi9719422 


Rattus 
norvegicus 


SNARE Vti1a-beta protein


1062 


96 


855 


gi9719420 


Rattus 
norvegicus 


SNARE Vti1a protein


1012 


93 


856 


AAB39312 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 3 SEQ ID 
NO:61.


315 


98 


856 


AAW88596 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 63 clone HFEBA88, 


307 


96 


857 


AAU14106 


Homo sapiens 


TRIM- Peptide sequence from human
c-fos proto-oncoprotein.


243 


100 


857 


AAR53646 


Homo sapiens 


YEDA c-fos gene product 


243 


100 


857 


gi6518629 


Homo sapiens 


gene for cellular oncogene c-fos, partial 
cds. 


243 


100 


858 


gil 0798770 


Homo sapiens 


mRNA for p53AIP1beta, complete cds.


449 


100 


858 


gil 0798768 


Homo sapiens 


mRNA for p53AIP1, complete cds.


440 


100 


858 


gil 0798772 


Homo sapiens 


mRNA for p53AIP1 gamma, complete
cds. 


257 


100 


859 


gil 751 1697 


Homo sapiens 


hypothetical protein FLJ14950, clone 
MGC:31757 IMAGE:5013235, mRNA,
complete cds. 


901 


100 


859 


AAB95526 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18113.


897 


99 


859 


gil4042838 


Homo sapiens 


cDNA FLJ14950 fis, clone
PLACE2000371, weakly similar to 
TENSIN. 


897 


99 


860 


AAG02557 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6638. 


297 




860 


AAG89349 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 469, 


241 


100 


860 


AAB42657 


Homo sapiens 


CURA- Human ORFX ORF2421 
polypeptide sequence SEQ ID 
NO:4842. 


126 


100 


861 


gil2855891 


Mus musculus 


putative 


173 


68 


861 


gi5360235 


Oryctolagus
cuniculus 


lectin-like oxidized LDL receptor 


77 


40 
861


AAY24153 


Chimeric - 
Homo sapiens 


NISB Bovine LOX-1 extracellular
region/human IgG1 Fc region chimeric
protein. 


74 


36 


862 


AAY42390 


Homo sapiens 


GEMY Alternative reading frame 
amino acid sequence of lv310_7. 


615 


100 


863 


gi 1 5278033 


Homo sapiens 


nuclear LIM interactor-interacting 
factor, clone MGC:15065
iMAUc:iDo/olo, mKlSA, complete 
cds. 




00 


863 


gil0257410 


Homo sapiens 


natural resistance-associated
macrophage protein 1 (SLC11A1)
gene, complete cds, alternatively
spliced; and nuclear LIM interactor-
interacting factor (NLI-IF) gene,
complete cds.




oo 
yy 


863 


gi 10257407 


Homo sapiens 


nuclear LIM interactor-interacting
factor (NLI-IF) mRNA, complete cds.


1356 


99 


864 


AAG78191 


Homo sapiens 


SHAN- Human mitochondrial ATPase 
coupling factor 6-14. 


512 


98 


864 


AAG01252 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5333. 


332 


98 


864 


g!l2861731 


Mus musculus 


putative 






865 


gi2323287 


multiple 
sclerosis 
associated 
retrovirus


polyprotein 


304 


53 


865 


gi38333 


Homo sapiens 


Human endogenous retrovirus pHE.1


260 


60 


865 


giI7432485 


porcine
endogenous
retrovirus


pol 


254 


47 


866 


gil6041690 


Homo sapiens 


hypothetical protein SP192, clone 

complete cds. 


2544 


100 


866 


gi 10503966 


Homo sapiens 


clone SP192 unknown mRNA. 


2544 


100 


866 


gi 10437401 


Homo sapiens 


cUNA: rUzl^iy iis^cione 




oo 

yy 


867 




Homo sapiens


hypothetical protein FLJ23584, clone
MGC:14863 IMAGE:3344580, mRNA, 
complete cds. 


I / 


100 


867 


gil0440321 


Homo sapiens 


cDNA: FU23584 fis, clone 
LNG14307. 


1237 


100 


867 


gi3191978 


Streptomyces
coelicolor
A3(2)


putative protein PII uridylyltransferase


84 


27 


868 


Ril0438988 


Homo sapiens 


cDNA: FLJ22558 fis, clone HSI01557.


841 


100 


868 


gil2852764 


Mus musculus 


putative 


88 


36 


868 


gi7688215 


Homo sapiens 


Human DNA sequence from clone 
RP4-788L20 on chromosome 20 
Contains the HNF3B (hepatocyte 
nuclear factor 3, beta) gene, a novel 
gene, ESTs, STSs, GSSs and five CpG 
islands, complete sequence. 


85 


35 
869


AAE02001 


Homo sapiens 


USSH Human viral IAP-associated
factor (VIAF). 


1044 


86 


869 


AAB43903 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1348. 


1044 


86 


869 


gil2654393 


Homo sapiens 


clone MGC:3062 IMAGE:3344703, 
mRNA, complete cds. 


1044 


86 


870 


gil3384257 


Homo sapiens 


apolipoprotein L5 mRNA, complete 
cds. 


2167 


98 


870 


gi6572236 


Homo sapiens 


Human DNA sequence from clone 
RPM1P2 on chromosome 22 Contains 
the 3' part of the RBM9 gene for RNA 
binding motif protein 9 and the 3' part 
of the gene for a novel protein similar 
to part of APOL (apolipoprotein L) and 
TNF-inducible protein CG12-1.
Contains ESTs, STSs and GSSs,
complete sequence. 


1614 


97 


870 


gil3384259 


Homo sapiens 


apolipoprotein L6 mRNA, complete
cds. 


478 


39 


871 


gil0732650 


Homo sapiens 


PP3111 mRNA, complete cds. 


452 


63 


871 


gi5051823 


Amycolatopsis 
orientalis 


putative peptide synthetase 


72 


30 


871 


gi2894188 


Amycolatopsis 
orientalis 


PCZA363.3 


72 


30 


872 


gil0438351 


Homo sapiens 


cDNA: FLJ22087 fis, clone HEP15918.


3942 


100 


872 


gil 0438800 


Homo sapiens 


cDNA: FLJ22417 fis, clone 
HRC08579. 


3937 


99 


872 


gil 3278208 


Mus musculus 


Similar to hypothetical protein 
FLJ22087


3410 


86 


873 


AAO10235 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 24127. 


810 


99 


873 


gil 1244871 


Homo sapiens 


dioxin receptor repressor (AHRR) 
gene, exon 12 and complete cds. 


784 


89 


873 


gi6330736 


Homo sapiens 


mRNA for KIAA1234 protein, partial 
cds. 


776 


88 


874 


AAB94957 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:16495.


732 


100 


874 


gil0433031 


Homo sapiens 


cDNA FLJ11715 fis, clone
HEMBA1005223.


732 


100 


874 


gi7620533 


Bradyrhizobium
japonicum


unknown 


80 


26 


875 


gil2652943 


Homo sapiens 


clone MGC:2488 IMAGE:3351245, 
mRNA, complete cds. 


1031 


100 




gllZUjJJU/ 


Homo sapiens 


mRNA; cDNA DKFZp434I209 (from
clone DKFZp434I209); complete cds.


iVj 1 


inn 


875 


eil2846815 


Mus musculus 


putative 


805 


78 


876 


AAG03976 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8057. 


457 


92 


876 


gi5922723 


Rattus 
norvegicus 


KPL2 


73 


35 


876 


gil6604679 


Arabidopsis
thaliana 


putative WD-repeat membrane protein 


67 


31 


877 


817959931 


Homo sapiens 


PRO2893


351 


100 
877


gi7544787 


Sus scrofa 


glycoprotein ZP1


68 


33 


877 


gi347421 


Sus scrofa


zona pellucida glycoprotein


68 


33 


878 


AAM41443 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 

INU o5 


287 


83 


878 


AAM39657 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2802. 


287 


83 


878 


AAM82707 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO:10300.




0 J 


879 


AAB68986 


Homo sapiens 


UYJO Human polyamine-modulated
factor-1 PMF-1.


749 


98 


879 


gi5737759 


Homo sapiens 


polyamine modulated factor-1 (PMF1)
mRNA, complete cds.


749 


98 


879 


gi5737757 


Homo sapiens 


polyamine modulated factor-1 (PMF1)
gene, exons 2 through 5 and complete
cds.


749 


98 


880 


AAY14462


Homo sapiens 


HUMA- Human secreted protein
encoded by gene 52 clone HFIUR35. 




98 


880 


gi6729212 


Clostridium 
botulinum 


KrNHA 


O/ 


j"t 


880 


gi7240602 


Clostridium 
botulinum 


progenitor toxin L nontoxic- 
nonhemagglutinin component 


65 


34 


881 


AAB94110


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 14346. 




99 


881 


gil043408o 


Homo sapiens 


ciilM A ru 1^3^^ lis. Clone 
NT2RM4000534. 


1 


99 


881 


AAG02676 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 6757.


210 


97 


882 


gi9956045 


Homo sapiens 


clone CDABP0066 mRNA sequence. 


894 


94 


882 


gi3413800 


Homo sapiens 


Homo sapiens mRNA for putative
secretory protein, hBET3.


Off 


Od 


882 


gi2791804 


Homo sapiens 


bet3 (BET3) mRNA, complete cds.




04 


883 


gi579068 


Bacteriophage 
phi-80 


y»TT MAMA / A A 1 11 ^^ 

cu gene ^AA l - \5L) 


0 J 1 




883 


gil2516141 


Escherichia 
coli O157:H7


unknown protein encoded within 
prophage CP-933U 


lUZ 




883 


gil33o2232 


Escherichia 
coli O157:H7


hypothetical protein 




36 


884 


gl7303563 


Drosophila 
melanogaster 


lajVUUj gene proauci 


f 0 


33 


884 


gll28oio39 


Mus musculus 


putative 


r U 


32 


884 


gilG241798 


Streptomyces 
coelicolor 


hypothetical protein SCE4 1.24c 


75 


33 


885 


gil7059636 


Homo sapiens 


Novel human gene mapping to 
chromosome 22.


2527 


99 


885 


gil4594694 


Mus musculus 


adiponutrin 


1419 


67 


885 


AAY53641 


Homo sapiens 


CHIR A bone marrow secreted protein 
designated BMS42. 


880 


45 


886 


AAY36025 


Homo sapiens 


GEST Extended human secreted 
protein sequence. SEQ ID NO. 410. 


198 


94 


886 


AAY11423 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID No 245. 


137 


100 
887


gi687590 


Homo sapiens 


Human (AF1q) mRNA, complete cds.


431 


93 


887 


gi 16307092 


Homo sapiens 


ALL1-fused gene from chromosome
1q, clone MGC:17309
IMAGE:3878959, mRNA, complete 
cds. 


431 


93 


887 


gi 14250081 


Homo sapiens 


ALL1-fused gene from chromosome
1q, clone MGC:14664
IMAGE:4103485, mRNA, complete
cds. 


431 


93 


888 


AAG7408S 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4849. 


286 


94 


888 


gil4043788 


Homo sapiens 


clone MGC:14288 IMAGE:4135996,
mRNA, complete cds. 


286 


94 


888 


AAY36036 


Homo sapiens 


GEST Extended human secreted
protein sequence, SEQ ID NO. 421. 


281 


92 


889 


gil6550275 


Homo sapiens 


cDNA FLJ30968 fis, clone
HEART2000411. 


1018 


98 


889 


AAM75969 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ 
ID NO: 36275. 


661 


100


889 


AAM63155 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 35260. 


661 


100 


890 


gil3559062 


Homo sapiens 


Human DNA sequence from clone 
RP11-552M11 on chromosome 1.
Contains the OVGP1 gene for
oviductal glycoprotein 1 (mucin 9,
oviductin), three novel genes, the
ATP5F1 gene for mitochondrial F0
complex H+ transporting ATP synthase
b1, the ADORA3 gene for adenosine
A3 receptor and an RPS29 (40S
ribosomal protein S29) pseudogene.
Contains ESTs, STSs, GSSs and two
CpG islands, complete sequence.


667 


100 


890 


AAY59703 


Homo sapiens 


GEST Secreted protein 47-2-3-G9-FL2.


509 


97 


890 


AAY11473 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID No 295. 


472 


94 


891 


gil0439198 


Homo sapiens 


cDNA: FLJ22704 fis, clone HSI12602.


1336 


100 


891 


gi 16877288 


Homo sapiens 


Similar to Hermansky-Pudlak 
syndrome 3, clone MGC:21006
IMAGE:4415076, mRNA, complete
cds. 


1191 


100 


891 


gil6552016 


Homo sapiens 


cuNA rLJjzuij lis, clone 
NTONG1000033. 


I iVi 


f 


892 


AAB27247 


Homo sapiens 


INCY- Human EXMAD-25 SEQ ID
NO: 25.


2242 


100 


892 


gil3938404 


Homo sapiens 


clone MGC:1526 IMAGE:2989807,
mRNA, complete cds. 


1544 


99 


892 


gil50il984 


Homo sapiens 


bystin mRNA, complete cds. 


1532 


99 


893 


AAG03168 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7249. 


415 


97 


893


gi5911457


Pseudomonas
aeruginosa


pyochelin synthetase PchF


75


51












Homo sapiens


high-mobility group (nonhistone
chromosomal) protein isoforms I and
Y, clone MGC:4242 IMAGE:2962998,
mRNA, complete cds.






894 






ROSE/ Human prostate cancer antigen
protein sequence SEQ ID NO:995.


890 


97 






Homo sapiens


COMP- A human kallikrein-2 (KLK-2)
splice variant polypeptide.


257 


52 


02r*t 


A AVO^HM 


Homo sapiens


AT VTiiman cAnrpt^rl nrAtf^in vn^ 1 
/vi^m*' jnunuui ac^icicu |muicui vpj^i> 

SEQ ID NO:68.


228 


67 




o« 1 1 n^ASAQ 
gl 1 1 UJ*loU7 


Homo sapiens


ieucine~zippcr proivui rivovji.^ 


ly 1*9 


QQ 

yy 


895 


gi2674195 


Mus musculus 


polymerase I-transcript release factor; 
PTRF 


1779 


92 






Gallus gallus


leucine zipper protein


I J 1 1 




896 


gil2697951 


Homo sapiens 


mRNA for KIAA1703 protein, partial 
cds. 


1130 


98 


896 


AAB94772 


Homo sapiens 


HELI- Human protein sequence SEQ 

lU JNly: 1^505. 


1002 


99 


896 


gil0435978 


Homo sapiens 


cDNA FLJ13839 fis, clone
in, I Kwivuu/ / f . 


1002 


99 


897 


AAY87322 


Homo sapiens 


INCY- Human signal peptide 
cuniaiiiiijg proicjii xioi^a mizs^ lu 
NO:99. 


888 


100 






Homo sapiens


HUMA- Human secreted protein, SEQ
ID NO: 191.


871 




5107 




Homo sapiens


GEST Human secreted protein, SEQ ID
NO: 7711. 




07 






Homo sapiens


GEST Extended human secreted
protein sequence, SEQ ID NO. 202.




OS 


oVo 




Homo sapiens 


GEST Extended human secreted
protein sequence, SEQ ID NO. 490. 




o'; 

yO 


fiOfi 

oyo 




Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 4706.


zov 


OB 


899 


AAY64868 


Homo sapiens 


GEST Human 5' EST related
polypeptide SEQ ID NO:1029.


486 


97 


900 


AAG00723 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 4804.


304 


100 


900 


gi330019 


Hepatitis E 
virus 


Structural viral protein 


69 


54 


900 


gi418310


Hepatitis E
virus


STRUCTURAL PROTEIN 1


69 


54 


901 


gi15779204


Homo sapiens


hypothetical protein FLJ12448, clone
MGC:22955 IMAGE:4860511, mRNA, 
complete cds. 


1318 


100 


901 


AAB94014 


Homo sapiens 


HELI- Human protein sequence SEQ 

ID NO: 14138.


1302 


99 


901 


gi10433939


Homo sapiens


cDNA FLJ12448 fis, clone
NT2RM1000300. 


1302 


99 


902 


AAB94507 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15214.


1220 


100 


902 


gi10435098


Homo sapiens


cDNA FLJ13188 fis, clone
NT2RP3004246. 


1220 


100 
902


gi10439320


Homo sapiens


rONA- FIJ99Q77 fi^ clone 
KAT11312. 


1201 


99 


903 


gi14517331


Homo sapiens


testes development-related NYD-SP20D
mRNA, complete cds.


672 


98 


903 


gi14517329


Homo sapiens 


ICbUa'UCVClV^piIlwIll iwlalwU i a/" 

^P^nPtnRMA rATnnlete Ciis 
OA lluViN/\) uuiupiwiv 


672 


98 


903 


gi14039851


Homo sapiens


testes development-related NYD-SP20
mRNA, complete cds. 


672 


98 


904 


AAW88724 


Homo sapiens 


HUMA- Secreted protein encoded by
gene 191 clone HJABZ65.


373 


98 


904 


gi13592178


Leishmania
major


ocnnc 1 orcoiiiiic r^iuicm xvuiaov ijav 

protein 4 


68 


36 


904 


AAO01200 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 15092.


66 


62 


905 


—11 A/1 1 AAO C 

gll 04390 w 


Homo sapiens 


/»n>JA- PT MO^ld fic rlnnp HSTO^Q^l 
(rUIN/\. FLJZZOZ** lis, ciuiic noi.yjjy'j I . 


1749 


100 


905 


gi13938004


Homo sapiens


Similar to hypothetical protein
FT n7f\7d niAnp TMAriF*4 104832 

mRNA, partial cds. 


1500 


99 


905 


AAM3oo3l 


Homo sapiens 


HUMA- Human colorectal cancer
antigen SEQ ID NO: 146.


714 


98 


906 


AAWoVoio 


Homo sapiens 


UT TNA A Uitvnan CA/^r^fprl nrntAtn 
riwIVlA'- nUIIlall SCClClCU piUIClll 

pnrTiHpH Kv opnp pIatip fTT Xf^Jfi*^ 


448 


95 


906 


gi15029372


Homo sapiens


sorbin polypeptide mRNA, complete
cds.


80 


31 


906 


gl 1 ZooU /ZZ 


Mus musculus


putative


80 


30 


907 


gi12854928


Mus musculus


putative 


688 


82 


907 


gl 103320 J 1 


Homo sapiens 


SMINT1000054. 


592 


100 


907 


AADjjyUO 


Homo sapiens


HUMA- Human colon cancer antigen
protein sequence SEQ ID NO: 1446.


491 


98 


908 


A A/^OIIAQ 

AAvjyiJOy 


Homo sapiens 


MT^P Tinman nrntpin HP 10^60 

iNlkjL'- nUlIlJlIl piUlCLlI iil l\fJ\J\J, 


598 


100 


908 


gi9954173 


Homo sapiens 


DNA polymerase delta smallest subunit 


598 


100 


908 


gi12845953


Mus musculus 


putative 


492 


83 


909 


AA Y4oZ33 


Homo sapiens 


META- Human prostate cancer-
associated protein 39.


334 


100 


909 


gi6458749


Deinococcus
radiodurans


hypothetical protein 


70 


38 


909 


gi1420437


Saccharomyces
cerevisiae


UKr YUKlolW 


fiifk 

uu 




910 


AAY48598 


Homo sapiens 


META- Human breast tumour- 
associated protein 59. 


370 


98 


910 


gi13424450


Caulobacter 
crescentus 


hypothetical protein 


68 


32 


910 


gi15833006


Escherichia
coli O157:H7


hypothetical protein 


66 


39 


911 


gi16553936


Homo sapiens


cDNA FLJ25219 fis, clone STM00503.


667 


100 


911 


gi14250164


Homo sapiens


Similar to RIKEN cDNA 2310030G06
gene, clone MGC:14839
IMAGE:4294167, mRNA, complete
cds.


667 


100 
911


AAG00856 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4937. 


488 


98 


912 


AAG01735 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

NO: 5816. 


349 


98 


912 




tvauUS 


intrtneir* farinri^T^ \ 0 n<*r^ntnr nrecimor 


67 


33 


912 


gi9968545


Narcissus
pseudonarcissus


beta-carotene hydroxylase


65 


33 


oil 




Homo sapiens


clone MGC:10820 IMAGE:3613742


373 


100 


913 


AAM91638 


Homo sapiens 


HUMA- Human
immune/haematopoietic antigen SEQ
ID NO:19231.


352 


91 


913 


gi36424 


Homo sapiens 


Human sec oncogene for SEC protein. 


308 


84 


914 


AAM79478 


Homo sapiens 


HYSE- Human protein SEQ ID NO 


386 


54 


914 


AAM78494 


Homo sapiens 


HYSE- Human protein SEQ ID NO 

1 lOU* 


386 


54 


914 


AAB28200 


Homo sapiens 


CORI- Human xs99. 


386 


54 






Homo sapiens 


mRNA. complete cds. 


645 


100 


915 


gil5929794 


Homo sapiens 


Similar to RNA polymerase 1-3(16 

IMAnF-2R47fiSI mRNA comolete 
cds. 


645 


100 


915 


gi12805135


Mus musculus 


Unknown (protein for 


492 


78 


916 


gi12698063


Homo sapiens 


mRNA for KIAA1759 protein, partial 


3964 


99 


OK 


At 1 "iAC^OA^ 

giizujzyo^ 


Homo sapiens


(from clone DKFZp566M1046);
complete cds.


3929 


97 


y lO 






cDNA: FLJ22665 fis, clone HSI08219.


3691 


99 


917






mRNA; cDNA DKFZp566M1046
(from clone DKFZp566M1046);
complete cds.


4028 


99 


917 


gi12698063


Homo sapiens 


mRNA for KIAA1759 protein, partial 

cds. 


3939 


98 


917 


gi10439143


Homo sapiens


cDNA: FLJ22665 fis, clone HSI08219.


3666 


97 


yio 






clone CDABP0066 mRNA sequence. 


270 


56 






Homo sapiens


Homo sapiens mRNA for putative
secretory protein, hBET3.


270 


56 


918 


gi2791804


Homo sapiens 


bet3 (BET3) mRNA, complete cds. 


270 


56 


919 


gi13925848


Homo sapiens 


kelch-like protein KLHL4c mRNA, 
complete cds, alternatively spliced 


765 


81 


919 


gi13925845


Homo sapiens 


kelch-like protein KLHL4 mRNA, 
complete cds, alternatively spliced. 


765 


81


919 


gi12697919


Homo sapiens 


mRNA for KIAA1687 protein, partial 
cds. 


765 


81 


920 


gi13185301


Homo sapiens 


unnamed protein product 


871 


100 


920 


gi14043484


Homo sapiens 


Similar to RIKEN cDNA 2810021014
gene, clone MGC:13159


711 


100 
IMAGE:4303698, mRNA, complete
cds. 






920 


gi12850457


Mus musculus 


putative 


702 


81 


921




Pneumocystis


kexin-like serine endoprotease


71 


75 


921 


AAO05346 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 19238. 


70 


62 




gi 1 / 0U7Z3 


Human
herpesvirus 5 


HCMVIRM = TRL4 


70 


61 


922 


AAcyj /ou 


Homo sapiens


HELI- Human protein sequence SEQ
ID NO: 13446.


1529 


100 


922 


gllU<l3ZoOU 


Homo sapiens


cDNA FLJ11577 fis, clone
HEMBA1003556.


1529 


100 


922 


gl I Z53DJ4D 


Mus musculus


putative


1257 


83 


923 


gi13177760


Homo sapiens


hypothetical protein FLJ21324, clone
complete cds. 


1220 


99 


923 


1 ri/i "iT/im 
gllU43 /4U / 


Homo sapiens 


COL02394. 


1217 


99 


923 




Homo sapiens


HUMA- Human cancer associated
protein sequence SEQ ID NO:988.


1216 


99 


924 




Homo sapiens 


HRC02873. 


902 


100 


924 


gll 4515442 


Caenorhabditis
elegans


hypothetical protein


85 


29 


924 


AAB84577 


Homo sapiens 


UYEM- Amino acid sequence of a 

in^hir^ hiimstn FT'? npntinp 


77 


37 


925 


AAB95224 


Homo sapiens 


HELI- Human protein sequence SEQ 
11/ IN 1 f 


837 


99 


925 


gi10434642


Homo sapiens


cDNA FLJ12891 fis, clone
NT2RP2004142. 


837 


99 


925 


gll O950l 1 


Neurospora 
crassa 


rAfofm^ tn T n QMAT T MI TPT FAR 
reiaico 10 u 1 oivi/vLfi^ in uv^j^e/viv 

RTROMTTPT FOPTiOTFTN C 


83 


50 


926 


AAE06150 


Homo sapiens 


HUMA- Human gene 14 encoded 
secreted protein fragment, SEQ ID 
xjno 1 7 


837 


100 


926 


AAY87173 


Homo sapiens 


HUMA- Human secreted protein 
<pntipnrp ^FD ID NO-? 1 7 


837 


100 


926 


AAE06151 


Homo sapiens 


HUMA- Human gene 14 encoded
secreted protein fragment, SEQ ID
NO:213.


212 


100 


927 


gi tyDyy 1 / 






816 


100 




oi 1 A/U\'X t ft? 
gli40Uj 10/ 


Homo sapiens


livnnfliptirsil nrntpin PR076nS clone 

MGC:19796 IMAGE:3845525, mRNA,
complete cds.


642 


100 


927 


AAY66180 


Homo sapiens 


META- Human bladder tumour EST 
encoded protein 38. 


370 


84 


928 


gil89989 


Homo sapiens 


Human protein kinase C-L (PRKCL) 
mRNA, complete cds. 


301 


72 


928 


gi56916 


Rattus 

norvegicus 


protein kinase 


286 


67 


928 


gi220527 


Mus musculus 


nPKC-eta 


286 


67 


929 


AAG03419 


Homo sapiens 


GEST Human secreted protein, SEQ ID 


273 


100 
ri\J, OvU.






929 


gi2746865 


Caenorhabditis
elegans


Hypothetical protein T05A8.4


67 


34 


930 


gi13879555


Mus musculus 


binder of Rho GTPase 3 


612 


79 


930 


gi5731209 


Mus musculus 


CRIB-containing BORG3 protein


DlZ 


70 


930 


gu 28421 66 


Mus musculus 


putative 




7Q 

fy 


931 


gi14017917


Homo sapiens


mRNA for KIAA1850 protein, partial
cds.


3878 


99 


931 


gi13365945


Macaca 
fascicularis 


hypothetical protein 


2320 


94 


931 


gi10439719


Homo sapiens


cDNA: FLJ23132 fis, clone
LNG08559. 


2256 


99 


932 


gi10440377


Homo sapiens 


mRNA for FLJ00024 protein, partial 
cds. 


937 


99 


932 


gi10440377


Homo sapiens 


FLJ00024 protein 


937 


99 


933 


gi15207959


Macaca 
fascicularis 


hypothetical protein 


632 


88 


933 


gi552009 


Streptococcus 
pyogenes 


peptidase 


97 


25 


933 


gi13623022


Streptococcus
pyogenes M1
GAS


C5A peptidase precursor 


95 


24 


934 


gi12860619


Mus musculus 


putative 


609 


96 


934 


AAM74162 


Homo sapiens 


MOLE- Human bone marrow
expressed probe encoded protein SEQ 
ID NO: 34468. 


182 


9/ 


934 


AAG03513 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7594. 


136 


96 


935 


AAW88598 


Homo sapiens 


HUMA- Secreted protein encoded by 
gene 65 clone HFVHY45. 


400




935 


gi12862020


Mus musculus 


putative 


269 




935 


AAW88821 


Homo sapiens 


HUMA- Polypeptide fragment encoded 
by gene 65. 


148 


100 


936 


AAY60495 


Homo sapiens 


META- Human normal bladder tissue 
EST encoded protein 167.


326 


98 


937 


AAG81401 


Homo sapiens 


ZYMO Human AFP protein sequence 
SEQ ID NO:320.


551 


100 


937 


AAG93300 


Homo sapiens 


NISC- Human protein HP10417. 


551 


100 


937 


AAB43646 


Homo sapiens 


HUMA- Human cancer associated 
protein sequence SEQ ID NO: 1091.


551 


100 


938 


AAY17388


Homo sapiens 


INCY- Human vesicle membrane 
protein-like protein 1.


950 


90 




gi9802048


Homo sapiens


hypothetical protein SBBIIO mRNA,
complete cds.




90


938 


gi8745394 


Homo sapiens 


Alu co-repressor 1 (ACR1) mRNA,
complete cds. 


950 


90 


939 


AAG78658 


Homo sapiens 


BODE- Human peroxidase 
antioxidising enzyme 24. 


303 


60 


939 


AAG04043 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8124. 


303 


60 


939 


AAY17388 


Homo sapiens 


INCY- Human vesicle membrane 
protein-like protein 1. 


303 


60 
940


AAY17388 


Homo sapiens 


INCY- Human vesicle membrane 
protein-like protein 1.


805 


79 


940 


gi9802048 


Homo sapiens 


hypothetical protein SBBIIO mRNA, 
complete cds. 


805 


79 


940 


gi8745394 


Homo sapiens 


Alu co-repressor 1 (ACR1) mRNA,
complete cds. 


805 


79 


941 


gi17132972


Nostoc sp.
PCC7120


ORF_ID:all3838-similar to kinesin
light chain


100 


25 


941 


gil335276 


Homo sapiens 


Human PRB3 gene (PRB3S) for G1
protein, exon 3. 


94 


24 


941 


gil335274 


Homo sapiens 


Human prb1 gene for salivary proline-
rich protein, exon 3.


93 


22 


942 


AAY22155 


Homo sapiens 


SAKA/ Human Nck associated protein
1. 


3552 


59 


942 


gi4760464 


Homo sapiens 


mRNA for Nck-associated protein 1 
(Napl), complete cds. 


3552 


59 


942 


gi15929137


Homo sapiens


NCK-associated protein 1, clone
MGC:8981 IMAGE:3907646, mRNA,
complete cds. 


3552 


59 


943 


gi54004 


Mus musculus 


put RP2 protein (aa 1-357) 


1210 


63 


943 


gi7298591 


Drosophila 
melanogaster 


CG10194 gene product 


472 


34 


943 


gi7298588 


Drosophila 
melanogaster 


CG10195 gene product


381 


31 


944 


gi17389434


Homo sapiens 


hypothetical protein FLJ22639, clone 
MGC:22172 IMAGE:4700838, mRNA, 
complete cds. 


876 


100 


944 


gi10439108


Homo sapiens


cDNA: FLJ22639 fis, clone HSI06816.


876 


100 


944 


AAG98701 


Homo sapiens 


COGE- Human cell death protective 
cDNA clone CNI-00717 ORF5 protein,
SEQ: 194. 


72 


28 


945 


AAB95692 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:18510.


1163 


100 


945 


gi10436474


Homo sapiens


cDNA FLJ14100 fis, clone
MAMMA1000855.


1163 


100 


945 


gi7020531 


Homo sapiens 


cDNA FLJ20433 fis, clone KAT03767. 


75 


25 


946 


AAB15389


Homo sapiens 


TOYJ Human interleukin 6 receptor 
protein. 


86 


26 


946 


gi4699964 


Homo sapiens 


PAC clone RP5-953A4 from 7q11.23-
q21.1, complete sequence.


85 


25 


946 


gi896310 


Mamestra
brassicae
nucleopolyhedrovirus


unknown protein 


84 


32 


947 


AAY12607 


Homo sapiens 


GEST Human 5' EST secreted protein 
SEQ ID NO: 272 from WO 9906553, 


395 


98 


947 


gil7223776 


Mus musculus 


MLLT6 


76 


33 


947 


gi7297961 


Drosophila 
melanogaster 


nub gene product 


71 


34 


948 


gi17046389


Homo sapiens


C21orf70 isoform B protein (C21orf70)
mRNA, complete cds, alternatively
spliced.


695 


71 


948 


gi17046387


Homo sapiens


C21orf70 isoform A protein (C21orf70)


670 


66 








mRNA, complete cds, alternatively
spliced. 






948 


gi14424633


Homo sapiens


clone MGC:16722 IMAGE:4128732,
mRNA, complete cds. 


670 


66 


949 


gi15779053


Homo sapiens


Similar to RIKEN cDNA 6720467C03 
gene, clone MOCzooiV 
IMAGE:4826612, mRNA, complete 

cds. 


869 


100


949 


gi12859857


Mus musculus 


putative 


777


oo 


949 


AAG02322 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6403. 


630 


99 


950 


AAG89289 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

NO: 409. 


374 


98 


950 


AAY45307 


Homo sapiens 


HUMA- Human secreted protein
fragment encoded from gene 15. 




OR 


950 


gi6523815 


Homo sapiens 


phosphotidylethanolamine N-
methyltransferase (PEMT) mRNA,
complete cds. 


'XIA 


OR 


952 


AAB94360 


Homo sapiens 


HELI- Human protem sequence dci^ 
lU INU.i4fio/. 






952 


gi10434636


Homo sapiens


cDNA FLJ12888 fis, clone

IN 1 zKrZllWUo 1 - 


3208 


99 


952 


gi12855328


Mus musculus 


putative 


2247 


72 


953 


gi476224 


Homo sapiens 


Human anion exchanger 3 cardiac 
isoform (cAE3) mRNA, partial cds. 


jyy 


100 


953 


gi10953762


Mus musculus 


anion exchanger 3 cardiac isoform 


JO J 




953 


gi202771 


Rattus rattus


GRF-cardiac specific 5' coding region; 
putative 


233 


63 


954 


gi12850828


Mus musculus 


putative 


173 


75 


954 


gi203519 


Rattus 
norvegicus 


cytochrome c oxidase subunit VIc


IT) 


7'> 


954 


AAM23875 


Homo sapiens 


HYSE- Human EST encoded protein
SEQ ID NO: 1400. 


JOl 


70 


955 


AAY36057 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 442. 


313 


88 


955 


AAY35931 


Homo sapiens 


GEST Extended human secreted 
protein sequence, SEQ ID NO. 180.


295 


100 


955 


AAY11851 


Homo sapiens 


GEST Human 5' EST secreted protein
SEQ ID No: 451. 


w 1 


// 


956 


gi16549966


Homo sapiens 


cDNA FLJ30707 fis, clone 
FCBBF2001211. 


2/5/ 


oo 
yy 


956 


AAM77437 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
ID NO: 37743.






956 


AAM64659 


Homo sapiens 


MOLE- Human brain expressed single 
exon probe encoded protein SEQ ID 
NO: 36764. 


658 


100 


957 


gi16551351


Homo sapiens


cDNA FLJ31509 fis, clone
NT2RI1000016. 


1226 


100 


957 


gil4133227 


Homo sapiens 


mRNA for KIAA0970 protein, partial 
cds. 


938 


98 


957 


AAG02178 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6259. 


738 


98 
958


gi3800830 


Rattus
norvegicus


putative four repeat ion channel 


711 


83 


958 


Kil790I375 


Homo sapiens 


unnamed protein product 


711 


83 


958 


gi7292976 


Drosophila 
melanogaster 


CG1517 gene product 


382 


44 


959 


AAY60063


Homo sapiens


META- Human endometrium tumour
EST encoded protein 123.


235 


97 


959 


AAY60064 


Homo sapiens 


META- Human endometrium tumour 
EST encoded protein 124. 


231 


97 


959 


gil5081715 


Arabidopsis
thaliana


At2g41840/rilA7.6 


81 


36 


960 


gi13177691


Homo sapiens


Similar to RIKEN cDNA 2410047102
gene, clone MGC:2560
IMAGE:2989772, mRNA, complete
cds.


689 


100 


960 


gi12858411


Mus musculus 


putative 


585 


86 


960 


AAG01650 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5731. 


270 


98 


961 


gi7981304


Homo sapiens


Human DNA sequence from clone
RP4-551D2 on chromosome 20q13.2.
Contains the gene for a cadherin-
like protein VR20, a novel gene, the
PPP1R6 gene for protein phosphatase 1
regulatory subunit 6, the 5' end of the
SYCP2 gene for synaptonemal
complex protein 2, ESTs, STSs, GSSs
and two putative CpG islands, complete
sequence.






961 


AAU18881 


Homo sapiens 


HUMA- Novel prostate gland antigen, 
ScqIDNolSO. 


652 


100 


961 


AAM96033


Homo sapiens


HUMA- Human reproductive system
related antigen SEQ ID NO: 4691.


652 


100 


962 


gi9622236 


Homo sapiens 


cadherin-like protein VR20 mRNA, 
partial cds. 


1235 


92 


962 


gi12743872


Homo sapiens


Human DNA sequence from clone
RP4-551D2 on chromosome 20q13.2-
13.33 Contains the gene for a cadherin-
like protein VR20, a novel gene, the
PPP1R6 gene for protein phosphatase 1
regulatory subunit 6, the 5' end of the
SYCP2 gene for synaptonemal
complex protein 2, ESTs, STSs, GSSs
and two putative CpG islands, complete
sequence.






962 


AAB47329 


Homo sapiens 


CURA-FCTR6. 


1091 


84 


963 


gi9622236 


Homo sapiens 


cadherin-like protein VR20 mRNA,
partial cds. 


1264 


100 


963 


gi12743872


Homo sapiens


Human DNA sequence from clone
RP4-551D2 on chromosome 20q13.2-
13.33 Contains the gene for a cadherin-
like protein VR20, a novel gene, the
PPP1R6 gene for protein phosphatase 1
regulatory subunit 6, the 5' end of the


1264 


100 








SYCP2 gene for synaptonemal
complex protein 2, ESTs, STSs, GSSs
and two putative CpG islands, complete 
sequence. 








/vrvo*t 1 j£,y 




CURA- FCTR6 


1085 


84 


964 


AAY60064 


Homo sapiens 


META- Human endometrium tumour 
EST encoded protein 124. 


330 


98 








mRNA for RGPR-p117, complete cds.


807 


79 


965 


gi14318616


Homo sapiens 


clone MGC:17455 IMAGE:3448742. 
mRNA, complete cds. 


807 


79 






Homo sapiens


GEST Human secreted protein, SEQ ID
NO: 6464.




96 






Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 16066. 




QQ 

yy 






Homo sapiens 


HBLI- Human stomach cancer 
expressed polypeptide SEQ ID NO 
149. 


Qin 

7 lU 


QO 




nil A71 QS#^7 


Homo sapiens 


chronic myelogenous leukemia tumor
antigen 66 mRNA, complete cds,
alternatively spliced.




QQ 
yy 


967 


AAG02669 


Homo sapiens 


GEST Human secreted protein, SEQ ID 


349 


100 


968 


gi10438452


Homo sapiens


cDNA: FLJ22170 fis, clone 

mvv^v/u u J £• . 


2870 


100 


968 


AAB41640 


Homo sapiens 


CURA- Human ORFX ORF1404
polypeptide sequence SEQ ID
NO:2808.


2037 


100 


968 


gi 15928410 


Mus musculus 


Similar to hypothetical protein
FLJ22170


1880 


69 


970 


gi10440460


Homo sapiens 


mRNA for FLJ00066 protein, partial 

cds. 


655 


99 




glf J 1 / 1 


thaliana


En/Spm-like transposon protein


91 


30 






Arabidopsis
thaliana


protodermal factor 1


Ql 




971 


gi15930209


Homo sapiens


hypothetical protein FLJ22477, clone

Mnr-0^^7 TMAnP-'^01777A mRNA 

complete cds. 


882 


100 


7/1 


gl 1 vH^OOOX 


Homo sapiens


cDNA: FLJ22477 fis, clone
HRC10815.




100 


Q7I 




Mus musculus


putative




76 


972 


AAB94173 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 14480. 


1542 


100 


972 


gi15215287


Homo sapiens


hypothetical protein FLJ12610, clone
MGC:15029 IMAGE:4026495, mRNA,
complete cds.


1542 


100 


972 


gi10434201


Homo sapiens


cDNA FLJ12610 fis, clone
NT2RM4001565. 


1542 


100 


973 


AAB94173 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO: 14480. 


1419 


93 


973 


gi15215287


Homo sapiens


hypothetical protein FLJ12610, clone
MGC:15029 IMAGE:4026495, mRNA,
complete cds. 


1419 


93 


973 


gi10434201


Homo sapiens


cDNA FLJ12610 fis, clone
NT2RM4001565.


1419 


93 


974 


AAY41352


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 45 clone HTXFH55. 


300 


100 


974 


gi15029737


Mus musculus 


complement component 2 (within H-
2S)


67 


58 


974 


gil92435 


Mus musculus


complement component C2 


67 


58 


975




Homo sapiens


HELI- Human protein sequence SEQ
ID NO:17623.


721 


100 


975 




Homo sapiens


cDNA FLJ13162 fis, clone
NT2RP3003625.


721 


100 


975 


gi7302554 


Drosophila 
melanogaster 


CG15094 gene product


79 


33 


976 


AAY65192 


Homo sapiens 


GEST Human 5' EST related 
polypeptide SEQ ID NO: 1353. 


206 


100 


977 


AAG00539 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4620. 


420 


93 


977 


gi7243272 


Homo sapiens 


mRNA for KIAA1437 protein, partial 
cds. 


199 


55 


977 


gi5824508 


Caenorhabditis 
elegans 


contains similarity to Pfam domain:
PF00018 (SH3 domain), Score=15.4,
E-value=0.00062, N=1-cDNA EST
yk300d7.3 comes from this
gene-cDNA EST yk300d7.5 comes
from this gene-cDNA EST yk310d10.3
comes from this gene-cDNA EST
yk310d10.5 comes from this
gene-cDNA EST yk553a4.3 comes
from this gene-cDNA EST yk553a4.5
comes from this gene-cDNA EST
yk622f8.3 comes from this

from this gene-cDNA EST yk674e4.3 

yk674e4.5 comes from this gene 


68 


33 


978


AAM41583 




HYSE- Human polypeptide SEQ ID
NO 6514. 


620 


100 


978 


AAND9797 


Homo sapiens 


HYSE- Human polypeptide SEQ ID 
NO 2942. 


620 


100 


978 


AAG04036 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8117.


607 


99 


979


AAY65244


Homo sapiens


GEST Human 5' EST related
polypeptide SEQ ID NO:1405.


207 


100 


979 


AAG00117 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4198. 


11 


55 


979 


gil6878268 


Homo sapiens 


Similar to apolipoprotein L, clone 
MGC:29731 IMAGE:4661222. mRNA, 
complete cds. 


11 


55 


980 


AAG02124 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6205. 


334 


100 


980 


gi4469399 


Mus musculus 


epithelial sodium channel alpha subunit 


69 


37 


980 


gi2148928 


Rattus 
norvegicus 


epithelial sodium channel alpha subunit 


69 


37 





981


gl / /Oo / / J 


Homo sapiens


section 97/105. 


1065 


99 


701 


gllZf 7\l f 0 


Saccharomyces
cerevisiae


■ ifilrnAU/n 
unptnpwii 


135 


40 


701 




Saccharomyces
cerevisiae


nRF vni O'iRc 


135 


40 


Oft*) 


AAljD/j7J 


Homo sapiens


SUGE- Amino acid sequence of human
protein kinase SGK258.


I UO f 




VoZ 


A ACAA^^O 


Homo sapiens 


HUMA- Human protein tyrosine kinase
receptor (PTK) from clone HDPSB68. 




QQ 

yy 




gi14017797


Homo sapiens


mRNA for KIAA1790 protein, partial
cds.


lo/y 


yy 


983 


gil0440430 


Homo sapiens 


mRNA for FLJ00050 protein, partial 
cds. 


1433 


100 


983 


AAY84901 


Homo sapiens 


INCY- A human proliferation and 

apoptosis related protein.


258 


40 


983 


gil2053225 


Homo sapiens 


mRNA; cDNA DKFZp434P2235 (from 
clone DKFZp434P2235); complete cds. 


257 


40 


984




Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 7332.










Mus musculus 


putative 


QQ 


Oj 


984 


gi7296664 


Drosophila 
melanogaster 


CG10981 gene product 


68 


34 


985 


AAY12780 


Homo sapiens 


GEST Human 5' EST secreted protein 


203 


100 


985 


gil3879614 


Mycobacterium
tuberculosis


PE_PGRS family protein 


111 


43 


985 


gi9954108 


Trypanosoma 
cruzi 


RNA binding protein RGGm 


104 


38 


986 


AAG67032 


Homo sapiens 


SHAN- Human endothelial monocyte

activating polypeptide 11-62. 


2496 


99 


986 


gi10438461


Homo sapiens


cDNA: FLJ22175 fis, clone

xlK\^UU//3. 


1186 


100 


986 


gi14250826


Homo sapiens


hypothetical protein FLJ22175, clone

JVlVjCl'*yjJ UVlAVJlS.^JUloZo, mKJNA, 

complete cds. 


1171 


99 






Homo sapiens


HELI- Human protein sequence SEQ
ID NO: 14591.


1097 


100


987 


AAB56999 


Homo sapiens 


ROSE/ Human prostate cancer antigen 
protein sequence SEQ ID NO: 1577.


1027 


100 




gi 1 W^nZy / 


Homo sapiens 


CUlNA rLJiZOOD tis, clone 
NT2RM4002256. 


1027


100


988


AAG02612 




GEST Human secreted protein, SEQ ID
NO: 7693. 


302 


100 


988 


gil5981929 


Yersinia pestis 


putative iron ABC transporter, ATP- 
binding protein 


64 


32 


988 


gil6124148 


Yersinia
pestis


putative iron ABC transporter, ATP- 
binding protein 


64 


32 


989 


AAG03478 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7559. 


190 


83 





989 


—Ml 1 A/) 1 1 fi 


Homo sapiens


DKFZp434M1123.1 - human
(fragment)


63 


36 


990






HUMA- Human colon cancer antigen
protein SEQ ID NO:4285.


406 


98 


990 


AAY00280 


Homo sapiens 


HUMA- Human secreted protein 
encoded by gene 23. 


359 


98 


991




Homo sapiens 


ryiov^" nuiTuiii piuicui &ir ivjvv* 


339 


100 


991 


gi9954173 


Homo sapiens 


DNA polymerase delta smallest subunit 
p12 (POLDS) mRNA, complete cds.


339 


100 


991 


gi12845953


Mus musculus


putative 


288 


84 


992 


AAW89035 


Homo sapiens 


HUMA- Polypeptide fragment encoded 
by gene 171. 


159 


100 


992 


gi5852085 


Oryza sativa 


zwhOOOS.l 


93 


27 


992 


AAB64815 


Homo sapiens 


HUMA- Human secreted protein 
sequence encoded by gene 43 SEQ ID 
NO:101. 


87 


30 


993 


gi14717079


Homo sapiens


Human DNA sequence from clone
RP3-469A13 on chromosome 20
Contains part of the gene for
KIAA0889 and a novel protein similar 

4-^ VI A AAOA1 mm'i.aI mama *V^t% C 

to KJAAOoUz, a novel gene, tiie d end 
of the part of the gene for a novel 
protein similar to N-myc downstream 
regulated (NDRG1), ESTs, STSs, GSSs
and four CpG islands, complete
sequence. 


1365 


99 


993 


AAB94598 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO:15418.


1055 


96 


993 


gi10435333


Homo sapiens


cDNA FLJ13346 fis, clone

V A D r* 1 ft All 07 


1055 


96 


994 


AAG02845 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

INU. 07^0. 


273 


100 


995 


AAM93342 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 

rs\j. Zoo J. 


283 


60 


995 


gi9279975 


Homo sapiens 


mRNA for Reprimo, complete cds. 


283 


60 


995 


gi12804111


Homo sapiens


candidate mediator of the p53-
dependent G2 arrest, clone

complete cds. 


283


60


995 




Bacillus 
subtilis 


ynzD 


79 




996 


giyoUx341 


Arabidopsis 
thaliana 


PI 71 01 7"^ 


74 


24 


996 


gi7303166 


Drosophila 

melanogaster 


CG12864 gene product 


74 


33 


997 


ABB12196 


Homo sapiens 


HYSE- Human secreted protein 
homologue, SEQ ID NO:2566. 


424 


98 


997 


AAG03905 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7986. 


173 


59 


997 


gil4043862 


Homo sapiens 


clone MGC:14138 IMAGE:3948518. 
mRNA, complete cds. 


173 


59 


998 


AAM78349 


Homo sapiens 


HYSE- Human protein SEQ ID NO 


72 


42 



186 








lOU. 






998 


AAM79333 


Homo sapiens 


HYSE- Human protein SEQ ID NO
2979. 


I\ 


Hi 


998 


gi15042611


Homo sapiens 


Ser/Thr protein kinase PAR-lBalpha 
mRNA, complete cds. 


71 


41


999 


gi16550716


Homo sapiens


cDNA FLJ31318 fis, clone
LIVER1000433, moderately similar to
Homo sapiens mRNA for neuropathy 
target esterase. 


2201 


100 


999 


AAM23450 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO:971.


1423 


100


999 




Homo sapiens 


INCY- Human cyclic nucleotide- 
associated protein-2 (CNAP-2). 


1422 


03 


1000 


gi7293162 


Drosophila
melanogaster


CG15603 gene product


71 


46 


1000 


gi4586294


Rhodococcus 
sp. CIR2 


transposase 


71 


4/ 


1000 


gi7300412 


Drosophila 
melanogaster 


CG14304 gene product


69 


41 


1001 


AAY07759 


Homo sapiens 


HUMA- Human secreted protein
fragment encoded from gene 16. 


793 


88 


1001 


gi14603397


Homo sapiens


mitochondrial ribosomal protein S28,
clone MGC:19500 IMAGE:4331173,
mRNA, complete cds.


787 


86 


1001 


gi44S4702 


Homo sapiens 


HSPC007 


787 


86 


1002 


gi16549918


Homo sapiens


cDNA FLJ30671 fis, clone
FCBBF1000687, moderately similar to
Mus musculus Rap2 interacting protein
8 (RPIP8) mRNA.


1527 


95 


1002 


AAB42726 


Homo sapiens 


CURA- Human ORFX ORF2490 
polypeptide sequence SEQ ID 
NO:4980.


1314 


98 


1002 


gi2588624


Homo sapiens


BAC clone CTB-60N22 from 7q21,
complete sequence. 


1314 


98 


1003 


gi10439234


Homo sapiens 


cDNA: FLJ22659 fis, clone HSI07953. 


756 


100 


1003 


A A X M^t\ \^A 


Homo sapiens 


MOLE- Human bone marrow 
expressed probe encoded protein SEQ 
iu MU: JuHiO. 


157 


96 


1003 


gi14027542


Mesorhizobium
loti


hypothetical protein


72 


32 


1004 


AAG62909 


Homo sapiens 


KLEE/ Amino acid sequence of a 
human xylosyltransferase (XT).


3614 


99 


1004 


gi11322268


Homo sapiens 


partial mRNA for xylosyltransferase I 


3614 


99 


1004 


gi15209651


Homo sapiens 


human XT-I (not completely) 


3614 


99 


1005 


AAG02478 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 6559. 


380 


100 


1005 


AAY86496 


Homo sapiens 


HUMA- Human gene 61-encoded
protein fragment, SEQ ID NO:411.


69 


35 


1005 


AAY86324 


Homo sapiens 


HUMA- Human secreted protein 

HSRGW16, SEQ ID NO:239.


69 


35 


1006 


AAB90708 


Homo sapiens 


GEMY Human CJ397_1 protein 
sequence SEQ ID 109. 


241 


100 






WO 02/074961



PCT/US02/05109 



Table 2 



SEQ ID NO:


Accession No. 


Species 


Description 
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% 
Identity 


1006 


AAW48809 


Homo sapiens 


GEMY Homo sapiens clone CJ397_1 
protein. 


241 


100 


1006


glO7l030 


Sorghum 
bicolor 


gamma-kafirin preprotein


83 


32 


1007


AAYjyool 


Homo sapiens


B7.FL 


408 


100 


1007


mil mini's 


Homo sapiens


Human hptsi.l il 

acetylgalactosaminyltransferase
mRNA, complete cds.


65 


45 


1007




Streptomyces
coelicolor 


putative integral membrane protein.


65 


45 


1008 


AAG73798 


Homo sapiens 


HUMA- Human colon cancer antigen 
protein SEQ ID NO:4562


653 


98 


1008 


AAG03987 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 8068


653 


98 


1008 


AAB54311 


Homo sapiens 


HUMA- Human pancreatic cancer 
NO:763. 


653 


98 


1009 


AAB24198 


Homo sapiens 


HONJ/ Human activation-induced 
cytidine deaminase SEQ ID NO: 8.


1086 


100 




oJQQOJMI A 

giyyoo*! iu 


Homo sapiens


AID mRNA for activation-induced
cytidine deaminase, complete cds.


1086 


100 




gl77DOtU0 


Homo sapiens


AID gene for activation-induced
cytidine deaminase, complete cds. 


1086 


100 


1010 


gi10439796


Homo sapiens 


cDNA: FLJ23189 fis, clone 
LNG12061. 


1172


100 


1010




Homo sapiens


MOLE- Human bone marrow

expressed probe encoded protein SEQ 
ID NO: 30762


467 


100 


1010 


gi2627231


Bos taurus


NDP52 


101 


28 


1011






cDNA: FLJ21858 fis, clone HEP02301.


744 


97 


1011


AAG66887 


Homo sapiens 


SHAN- Human zinc finger protein 17.


156 


30 


1011 


gi16553140


Homo sapiens 


cDNA FLJ32873 fis, clone
TESTI2003998, weakly similar to T-
CELL RECEPTOR BETA CHAIN
ANA 11. 


146 


38 


1012 


AAG03653 


Homo sapiens 


GEST Human secreted protein, SEQ ID 

NO: 7734. 


425 


100 


1012 


AAU19393 


Homo sapiens 


PHAA Human G protein-coupled 
recepior norv^tv-zjxo. 


87 


36 


1013 


gi6180179 


Homo sapiens 


transcription factor IGHM enhancer 3,
JM11 protein, JM4 protein, JM5
protein, T54 protein, JM10 protein, A4
differentiation-dependent protein, triple
LIM domain protein 6, and
synaptophysin genes, complete cds;
and L-type calcium channel alpha-1
subunit gene, partial cds, complete
sequence.


3632 


99 


1013 


gi14250618


Homo sapiens 


clone MGC:2962 IMAGE:3139519, 
mRNA, complete cds. 


3077 


94 


1013 


gi7242943


Homo sapiens 


mRNA for KIAA1294 protein, partial


297 


32 



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% 

Identity








^Ua« 








/V/\ I UDVfXrr 


Homo sapiens


CtF^T Human P^T r^lflterl 

polypeptide SEQ ID NO: 1165.




7Q 






elegans 




uo 






oi 7407774 


Caenorhabditis
elegans


iiypuuicaval proicui ^jurH.f 
Caenorhabditis elegans > 


Do 


JO 




A AVf«n^7fi 
A/\ I /o 


Homo sapiens


jvic 1 /K" numan norniai uiauacr ussue 
EST encoded protein 250. 


477 
Hit 


inn 


lUlO 


nil OQyIOl 1 A 
gll Zo4Vi iO 


Mus musculus 


putative 




7A 
/O 


1016 


AAB50970 


Homo sapiens 


GETH Human PRO4302 protein. 


306 


35 


1016


A AT f 1 'iAA^ 


Homo sapiens 


GETH Human PRO4302 polypeptide

sequence. 


306




1017


giZi 13745 


Helicobacter 
pylori 26695


H. pylori predicted coding region 

nrUO I** 


"71 




1017 


gil0038760 


Buchnera sp. 
APS 


flagellar assembly protein fliH 


11 


32 


1017


n^l CI ylAAOA 


lumpy skin 
disease virus 


LSDV079 mRNA capping enzyme
large subunit


fit. 
DO 




1018


gi99672o9 


Macaca 
fascicularis 


hypothetical protein 


3jo 


91 




A A /^A^A'*4C 


Homo sapiens 


GEST Human secreted protein, SEQ ID

■Mrv. 71 AT 


243 


100


1019 


gi12853136


Mus musculus


putative 


166 


67 




A A DA 1 ^OC 


Homo sapiens 


UUKA- Human UKr A UKr liwy 
polypeptide sequence SEQ ID 


04 


J4 


1020 


AAY36512 


Homo sapiens 


HUMA- Fragment of human secreted
protein encoded by gene 32. 


748 


100 




gi/z43i /y 


Homo sapiens 


mRNA for KIAA1399 protein, partial
cds. 




A 1 


1020 


gi7243179 


Homo sapiens 


KIAA1399 protein 


82 


41 


1021


AAB95621 


Homo sapiens 


HELI- Human protein sequence SEQ 

TF\ xr/^. 1 0 1 1 D 
iV NU:1533o. 


O ACO 

2058 


AA 

99 


m*} 1 
IvZi 


glIU4J0Z/^ 


_ ; 

Homo sapiens 


cJJNA tLJljyoo IIS, Clone 
Y79AA1001216. 




yy 






Homo sapiens 


nypotneticai prorcin rLJi^43o, clone 
complete cds.


zUjO 


yy 


1022 


gi514268


Homo sapiens 


Human proto-oncogene tyrosine-
protein kinase (ABL) gene, exon 1a
and exons 2-10, complete cds.


248 


100 




J Jo /u 


Mus musculus


c-abl protein, type IV




yj 


1022 


gi49841 


Mus musculus 


c-abl protein


242 


95 


1023 


AAG66758 


Homo sapiens 


BIOW- Human promoter binding factor 
13. 


627 


100 


1023 


gi9963908 


Homo sapiens 


NPD009 mRNA, complete cds. 


627 


100 


1023 


gi14290450


Homo sapiens 


NPD009 protein, clone MGC:16898
IMAGE:4156159, mRNA, complete

cds. 


624 


99 


1024 


gi11138042


Homo sapiens 


mRNA, similar to rat myomegalin, 
complete cds. 


1227 


99 


1024 


AAY00346 


Homo sapiens 


HUMA- Fragment of human secreted 


1206 


97 



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Species 
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% 
Identity 








protein encoded by gene 2. 






1024 


AAM25852 


Homo sapiens 


HYSE- Human protein sequence SEQ 
ID NO: 1367. 


1199 


96 


1025 


AAG00700 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 4781. 


393 


98 


1025 


gi12858787


Mus musculus 


putative 


313 


93 


1025 


gi16553210


Homo sapiens 


cDNA FLJ32921 fis, clone
TESTI2006872. 


209 


70 


1026 


gi16924223


Homo sapiens 


hypothetical protein FLJ12929, clone 
MGC:22200 IMAGE:4070101, mRNA, 
complete cds. 


682 


100 


1026 


AAB95241 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17394. 


673 


99 


1026 


gi10434702


Homo sapiens 


cDNA FLJ12929 fis, clone
NT2RP2004775. 


673 


99 


1027 


gi14017897


Homo sapiens 


mRNA for KIAA1840 protein, partial 

cds. 


2216 


100 


1027 




Homo sapiens


cDNA: FLJ21439 fis, clone 
COL04352. 


2210 


99 




AAG81395 


Homo sapiens


ZYMO Human AFP protein sequence 

SEQ ID NO:308. 


1308 


100 


1028


gi3127176


Homo sapiens


sulfonylurea receptor 2B (SUR2) gene, 
alternatively spliced product, exon 38b 
and complete cds. 


723 


100 


1028 


gi3127175


Homo sapiens 


sulfonylurea receptor 2A (SUR2) gene, 
alternatively spliced product, exon 38a 
and complete cds. 


723 


100 


1028 


gi15778680


Oryctolagus 
cuniculus 


sulphonylurea receptor 2B 


710 


98 


1029 


gi14333990


Homo sapiens 


enhancer of polycomb 2 (EPC2) 
mRNA, complete cds. 


3911 


99 


1029 


gi11907923


Homo sapiens 


enhancer of polycomb mRNA, 
complete cds. 


3879 


97 


1029 


gi3757892 


Mus musculus 


enhancer of polycomb 


3613 


92 


1030 


gi9967305 


Macaca 
fascicularis 


hypothetical protein 


313 


94 


1030 


AAM80165 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
3811. 


76 


43 


1030 


AAM79181 


Homo sapiens 


HYSE- Human protein SEQ ID NO 
1843. 


76 


43 


1031 


AAM93813


Homo sapiens 


HELI- Human polypeptide, SEQ ID

NO: 3861. 


346 


95 


1031 


AAG01877 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5958. 


341 


100 


1031 


gi5917666 


Zea mays


extensin-like protein 


67 


53 


1032 


AAM93813 


Homo sapiens 


HELI- Human polypeptide, SEQ ID 
NO: 3861. 


341 


100 


1032 


AAG01877 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 5958. 


341 


100 


1032 


gi10799949


Rattus

norvegicus 


ABC2 


72 


36 


1033 


AAY19473


Homo sapiens 


HUMA- Amino acid sequence of a 
human secreted protein. 


264 


100 



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Species 


Description 


Score 


% 
Identity 


1034 


gi17390437


Homo sapiens 


clone MGC:9829 IMAGE:3863118,
mRNA, complete cds. 


879 


99 


1034 






putative


777 


85 


1034 


gi10440154


Homo sapiens 


cDNA: FLJ23459 fis, clone HS107588.


758 


100 


1035 


AAR97285 


Homo sapiens 


KYOW Human 26S proteasome 

^nn cri ft itivi* nmfpin P^l 


1331 


100 






Homo sapiens






100 


1035 


gi12654653


Homo sapiens 


proteasome (prosome, macropain) 26S 
MGC:1660 IMAGE:3528096, mRNA, 

complete cds.


1331 


100 


1036 


gi12654125


Homo sapiens 


hypothetical protein PP5395, clone 
MGC:5610 IMAGE:3461724, mRNA,
complete cds. 


766 


99 




I'll Vr*rl7UO 


llvrlllU dO|ilwlld 




766 


99 


1036 


gi12843917


Mus musculus 


putative 


535 


73 


inn 


AAlJuZ/CW 


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 6845. 


7R1 

ZO i 


100 


inn 


gl 1 / ZZ 1 3*f4 


Kluyveromyces
lactis


hypothetical protein


O f 


35 


11/ J I 


gl 1 DO'l3rlr* 1 


/MaUlUOpSla 


UlUUlUWIi piUlCill 




37 


1038 


AAO02417 


Homo sapiens 


HYSE- Human polypeptide SEQ ID
NO 16109 


445 


96 


1038 


AAG03101 


Homo sapiens 


GEST Human secreted protein, SEQ ID 
NO: 7182


385 


97 


1038 




Helobdella
robusta


LOX5


73 


34 


1039 


AAE09718 


Homo sapiens 


MILL- Human ubiquitin carboxy- 

fprminal hvHmla^e 2')4 ^(6 nratein 


571 


100 


1039 


gi16547646


Homo sapiens 


unnamed protein product 


571 


100 


1039 


AAB74684 


Homo sapiens 


INCY- Human protease and protease 
inhibitor PPIM-17. 


561 


100 




A AMO^fi^/^ 
/1/\XYIZJoOO 


Homo sapiens


HYSE- Human protein sequence SEQ
ID NO:1381.


Oil 


100 




glil/HHUlDo 


Homo sapiens


rDMA* FT M'KAfsSL fic r1nnf> H^TI \fi(S% 
^mJiSi\, n^z.7Hoo lis, wiunc noii iw^. 


Oil 


100 


1040 


gi12839602


Mus musculus


putative 


573 


65 


1041


A AD^m IB 


Homo sapiens


iiNL. X - jiuman uanspon prorcm i rr i - 


IZJU 


inn 


1041 


gi16552638


Homo sapiens 


cDNA FLJ32499 fis, clone
SKNSH2000347, weakly similar to

(EC 1.1.2.3). 


842 


98 


1041 


gi9801259 


Leishmania 
major 


possible CG15429 protein


449 


44 


1042 


AAB94782 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 15884.


330 


70 


1042 


AAU27665 


Homo sapiens 


ZYMO Human protein AFP 162878. 


330 


70 


1042 


gi15215279


Homo sapiens 


hypothetical protein MGC11349, clone
MGC:14984 IMAGE:3635966, mRNA,
complete cds. 


330 


70 


1043 


gi10439613


Homo sapiens 


cDNA: FLJ23047 fis, clone


668 


99 
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bE%i W iNU: 



Accession No. 




Description


Score 


% 
Identity 








LNG02513. 






1043 


gU2550U3U 


Mus musculus


putative


340 


53 


1043 


gil3622152 


Streptococcus 
pyogenes M1
GAS 


hypothetical protein 


88 


29 


1044 


AAB94493 


Homo sapiens 


HELI- Human protein sequence SEQ
ID NO:15184.


193 


90 


1044 


gllo3U/iol 


Mus musculus


Similar tf\ Hvnamin ? 


191 


88 


1044 


gi12853743


Mus musculus 


putative


191 


88 


1045 


AAM25873 


Homo sapiens 


HYSE- Human protein sequence SEQ 


516 


100 


1045 


AAY57878 


Homo sapiens 


INCY- Human transmembrane protein 


516 


100 


1045 


AAU39009


Homo sapiens 


GEMY Human secreted protein 


80 


30 


1046 


AAG03414


Homo sapiens 


GEST Human secreted protein, SEQ ID
NO: 7495.


328 


100 


1046 


gi230152o 


unidentified 


AXifVT run PPOTPTM a a 


100 


29 


1046 


gi160229


Plasmodium 
reichenowi 


circumsporozoite protein 


95 


30 


1047 


gi16550135


Homo sapiens 


cDNA FLJ30851 fis, clone
FEBRA2002908. 


840 


100 


1047 


gi9967240 


Macaca 
fascicularis 


hypothetical protein




71 


1047 


gi12853386


Mus musculus 


putative 




46 


1048 


gi3746069 


Arabidopsis 
thaliana 


putative non-LTR retroelement reverse 
transcriptase 


74 


31 


1048 


gi7271069 


Candida 
albicans 


hypothetical protein 


71 


36 


1048 


gi13882111


Mycobacteriu 
m tuberculosis 
CDC1551 


PE family protein


70 


34 


1049 


gi9947823 


Pseudomonas 
aeruginosa 


hypothetical protein 


643 


70 


1049 


gi17429445


Ralstonia
solanacearum 


PROTEIN 


365 


56 


1049 


gi9950333 


Pseudomonas 
aeruginosa 


hypothetical protein 


321 


47 


1050 


AAY27754 


Homo sapiens 


HUMA- Human secreted protein
encoded by gene No. 38. 


555 


100 


1050 


gi2104464


Schizosacchar 
omyces pombe 


hypothetical protein


71 


29 


1050 


gi3287941 


Schizosacchar 
omyces pombe 


C25H2.15 IN CHROMOSOME II > 


71 


29 


1051 


AAB95246 


Homo sapiens 


HELI- Human protein sequence SEQ 
ID NO: 17407. 


757 


100 


1051 


AAB95127 


Homo sapiens 


HELI- Human protein sequence SEQ 

ID NO: 17129.


757 


100 


1051 


gi10434139


Homo sapiens 


cDNA FLJ12572 fis, clone
NT2RM4000971. 


757 


100 


1052 


AAB53a66 


Homo sapiens 


GETH Human angiogenesis-associated 
protein PRO178, SEQ ID NO:11.


71 


64 


1052 


AAB51330 


Homo sapiens 


HERI- Human NEW angiopoietin-like


71 


64 
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Accession No. 


Species 


Description 


Score 


% 
Identity 








protein SEQ ID NO:8.






1052


AAY72626 


Homo sapiens 


HYSE- Human angiopoietin protein, 
CG0I5alt2. 


71 


64 
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SEQID 
NO: 


Database 
entry ID 


Description 


*Results


588


BL01183 


ubiE/COQ5 methyltransferase family
proteins. 


BL01183B 21.31 3.317e-11 146-191


629 


BL00223 


Annexins repeat proteins domain 
proteins. 


BL00223A 15.59 4.414e-30 20-54 BL00223C
24.79 1.186e-11 7-62


629 


PR00198 


ANNEXIN TYPE II SIGNATURE


PR00198B 8.71 4.767e-13 29-52 PR00198D
7.65 4.758e-12 24-46 PR00198D 7.65 3.298e-11
96-118


629 


PR00202 


ANNEXIN TYPE VI SIGNATURE 


PR00202B 11.44 8.986e-19 28-52 PR00202C
13.34 4.452e-16 69-86 PR00202D 5.58 5.182e-11
96-118


629 


PR00199 


ANNEXIN TYPE III SIGNATURE 


PR00199B 6.86 1.651e-16 29-52 PR00199D
5.65 7.039e-13 24-46 PR00199D 5.65 3.586e-10
96-118 PR00199C 13.84 7.152e-10 69-86


629 


PR00197 


ANNEXIN TYPE I SIGNATURE 


PR00197D 7.50 8.125e-15 24-46 PR00197B
7.56 9.143e-12 29-52 PR00197D 7.50 8.813e-10
96-118


629 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196A 11.16 3.700e-21 29-52 PR00196C
10.36 3.298e-17 96-118 PR00196B 10.68 
7.750e-17 69-86 PR00196C 10.36 4.536e-14 
24-46 PR00196E 9.19 1.563e-09 28-49 


629 


PR00200 


ANNEXIN TYPE IV SIGNATURE 


PR00200B 7.39 5.919e-15 29-52 PR00200E
10.00 5.871e-13 24-46 PR00200E 10.00 
8.941e-13 96-118 PR00200D 10.01 9.471e-12 


629 


PR00201 


ANNEXIN TYPE V SIGNATURE 


PR00201A 6.05 1.000e-28 29-52 PR00201D
10.49 3.250e-24 96-118 PR00201C 11.13
1.474e-21 69-86 PR00201B 8.88 2.552e-11 53-62
PR00201D 10.49 7.198e-09 24-46


795 


BL00572 


Glycosyl hydrolases family 1 proteins. 


BL00572C 20.73 2.324e-25 40-75 


938 


PD00210 


PROTEIN ANTIOXIDANT 
PEROXIDASE RED. 


PD00210 15.25 3.912e-09 88-104 


940 


PD00210 


PROTEIN ANTIOXIDANT 
PEROXIDASE RED. 


PD00210 15.25 5.500e-09 88-104



* Results include in order: Accession No., subtype, e-value, and amino acid position of the signature in 
the corresponding polypeptide 
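
The footnote above describes how each Results entry is laid out. As an illustration only, the following minimal Python sketch splits one Results cell into its individual signature hits. The names `SignatureHit` and `parse_results` and the regular expression are hypothetical and are not part of the original disclosure; the sample strings are taken from the rows for SEQ ID NO: 629 and 938 with extraction errors corrected. The numeric field that precedes the e-value is not named in the footnote, so the sketch stores it under the neutral name `value`.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str   # database entry ID, e.g. "PR00198"
    subtype: str     # trailing letter, e.g. "B"; empty string when absent
    value: float     # numeric field preceding the e-value (not named in the footnote)
    e_value: float
    start: int       # first amino acid position of the signature
    end: int         # last amino acid position of the signature


# Assumed layout: two letters and five digits, an optional subtype letter,
# a decimal number, an e-value in scientific notation, and a position range.
_HIT = re.compile(
    r"([A-Z]{2}\d{5})([A-Z]?)\s+"
    r"(\d+\.\d+)\s+"
    r"(\d+\.\d+e[-+]\d+)\s+"
    r"(\d+)-(\d+)"
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Return every signature hit found in one Results cell."""
    return [
        SignatureHit(
            m.group(1), m.group(2), float(m.group(3)),
            float(m.group(4)), int(m.group(5)), int(m.group(6)),
        )
        for m in _HIT.finditer(cell)
    ]


hits = parse_results(
    "PR00198B 8.71 4.767e-13 29-52 PR00198D 7.65 4.758e-12 24-46"
)
for hit in hits:
    print(hit.accession, hit.subtype, hit.e_value, hit.start, hit.end)
```

Because the pattern tolerates any whitespace between fields, a cell that wraps over several lines can be passed in unchanged. Entries with a missing field, such as a hit without a position range, are skipped rather than guessed.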
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S£Q 
ID 


Pfam Model 


Description 


E-value


Score 


No. of
Pfam
Domains


Position of 
the Domain


527 


ion_trans


Ion transport protein 


8.3e-18 


72.6 




438-672 


527 


ank 


Ankyrin repeat 


1.9e-06 


34.8 





77-108:163- 
195 


527 


Srg


C.elegans Srg family integral 
membrane prot 


8.1 


-222.4 




418-669 


546 


vwa 


von Willebrand factor type A 
domain 


0.77


-42.3 




776-958 


553 


TPR 


TPR Domain 


7.1 


7.4 




40-73 


566 


PTPS


6-pyruvoyl tetrahydropterin 

synthase


0.54 


-52.2 


J 


52-149 


575 


PolyA_pol 


Poly A polymerase family 


1.3 


-61.4 


Li 


37-155 




Ubie_methyltran


ubiE/COQ5 methyltransferase
family 


0.6 


-150.7 




65-249 


591 


ubiquitin 


Ubiquitin family


0.15 


11.5 


a 


106-197 






Topoisomerase DNA binding
C4 zinc finger


9.2 


-5.6 


-., 


96-130 


594 


zf-C2H2 


Zinc finger, C2H2 type


1.1 


15.8 




61-85 


599 


CBFD_NFYB_HMF


Histone-like transcription
factor 


3.8 


-8.3 




26-89 


610 


vwd 


von Willebrand factor type D 
domain 


7.7 


-30.1




169-321 


610 


HRM 


Hormone receptor domain 


7.8 


-13.5 


T 


85-150 


612 


Metallophos 


Calcineurin-like 
phosphoesterase 


7.9 


-8.2 




18-177 


618 


Peptidase_C54 


Peptidase family C54


5.9e-207 


700.9 




42-332 


627 


AT_hook


AT hook motif 


8.5 


7.9 




97-109 


629 


annexin


Annexin 


7.6e-31 


115.9 




17-84 


631 


ABC-3 


ABC 3 transport family 


2.1 


-182.9 




152-349 




ion_trans


Ion transport protein


8.3 


-13.4 




187-389 


653 


LEA 


Late embryogenesis abundant 

protein


8.2 


-6.8 




203-270


655 


PMP22_Claudin 


PMP-22/EMP/MP20/Claudin
family 


2.9 


-60.0 




8-159 


669 


CBM_21 


Putative phosphatase 

regulatory subunit


0.0056 


5.1 




280-418


671 


V-ATPase_C


V-ATPase subunit C 


1.3e-54 


194.8 




1-225 




Tim 17 
1 mil t 


iviiL\i^uuiiUi lai iiuLyui I iiuivi 




7^0 7 




51-184 

J 1 — JlO*T 


678 


Timl7 


Mitochondrial import inner 
membrane transloc


3.1e-57 


203.6 




51-234 


681 


PARP 


Poly(ADP-ribose) polymerase 
catalytic domain 


5.2 


-96.7 


1


397-577 


692 


vATP-synt_E 


ATP synthase (E/31 kDa) 
subunit 


4.1 


-92.4 




276-459


693 


vATP-synt_E 


ATP synthase (E/31 kDa) 
subunit 


4.1 


-92.4 




332-515 


709 


Ribosomal_S25


S25 ribosomal protein 


7.9e-44


159.0 




1-113 


716 


DUF6 


Integral membrane protein 
DUF6 


3.1 


-16.3 




11-145 


717 


PAP2 


PAP2 superfamily 


1.7 


-22.6 


1 


174-355 
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ID 


Pfam Model 


Description 


E-value 


Score 


No: of 
Pfam 
Domain 
s 


Position of 
the Domain 


718 


IBR 


IBR domain 


6.2 


-14.4 




59-110 


732 


zf-C2H2 


Zinc finger, C2H2 type


9.7 


9.4 


, - 


389-410 


745 


DUF81 


Domain of unknown function 
DUF81


4.7 


-44.7 




3-150 


751 


Glyco_hydro_2_N 


Glycosyl hydrolases family 2, 
sugar b 


0.44 


-75.0 




37-144 


761 


Myc-LZ 


Myc leucine zipper domain 


2.2 


12.8 


i 


136-168 


762 


Tropomyosin 


Tropomyosin 


5.2 


-116.0 


1 


318-529 


762 


LEM 


LEM domain 


10 


-4.0 


1 


461-504 


764 


SEA 


SEA domain 


0.076 


17.1 




112-245:270- 

395:561- 

684:955-1085 


769 


TTL 


Tubulin-tyrosine ligase family 


2.4e-93


323.5 


■{ 


35-344 


780 


HEAT_PBS 


PBS lyase HEAT-like repeat 


0.17 


18.4 




298-323:390- 

422 


780 


Adaptin_N


Adaptin N terminal region


0.46 


-162.5 


1 


65-643 


780 


Dioxygenase 


Dioxygenase 


2.5 


-106.2 


1 


807-937 


785 


CENP-B 


CENP-B protein 


1.4e-07 


4.9 




178-367 


785 


HTH_5


Bacterial regulatory protein, 
arsR family 


0.48 


5.3 


1 

_ 


9-93 


785 


HTH_3


Helix-turn-helix


1.4 


10.3 




20-74 


788 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


1.4 


9.4 




76-103:116- 
144 


791 


Calx-beta 


Calx-beta domain 


0.0011 


16.7 


4 


82-160 


795 


Glyco_hydro_1


Glycosyl hydrolase family 1 


7.4e-07 


-205.2 




2-171 


807 


rrm


RNA recognition motif. 


0.78 


3.4 




13-80 


807 


UIM 


Ubiquitin interaction motif 


3.4 


13.2 




650-667:673- 
690 


822 


Keratin_B2 


Keratin, high sulfur B2
protein 


0.15 


-55.4 


1 


2-161 


825 


Acetyltransf 


Acetyltransferase (GNAT) 
family 


5.7 


1.1 




191-277 


846 


Glyoxalase 


Glyoxalase/Bleomycin
resistance protein/Di 


0.074 


11.7 




2-118 


850 


MORN 


MORN repeat 


1.1e-28


108.7 




14-36:38- 

59:60-80:106- 

128:157- 

179:309- 

331:332-354 


863 


NIF 


NLI interacting factor 


1.6e-104 


360.6 


1 


82-256 


869 


Phosducin 


Phosducin 


0.0067 


-89.2 


1 


1-239 


870 


MotA_ExbB


MotA/TolQ/ExbB proton 
channel family 


1.5 


-49.3 


1 


89-204 


872 


Armadillo_seg


Armadillo/beta-catenin-like
repeat 


0.42 


17.1 


2 


677-717:727- 
769 


872 


HEAT_PBS


PBS lyase HEAT-like repeat 


6.1 


13.2 


3 


410-436:704- 
745:756-798 


872 


Adaptin_N


Adaptin N terminal region


9.5 


-204.6 


1 


215-972 


885 


Patatin 


Patatin-like phospholipase


9.2e-30


112.3 


1 


10-179 


890 


Glycophorin_A


Glycophorin A 


5.9 


-44.6 


1 


2-91 
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Description 


E-value


Score 


No: of 
Pfam 
Domain 
s 


Position of 
the Domain 


901 


dehydrin 


Dehydrin 


6.6 


-77.8




52-223 


931 


ubiquitin 


Ubiquitin family 


8.5 


-6.5 


1 


405-478 


938 


AhpC-TSA 


AhpC/TSA family 


5e-05 


3.6 


1 


58-189 


940 


AhpC-TSA 


AhpC/TSA family


0.28 


-43.3 


] 


58-165 


943 


NUDIX 


MutT-like domain 


8.6e-07 


36.0 


1 


12-264 


949 


V-ATPase_G 


Vacuolar (H+)-ATPase G 
subunit 


5.8 


-48.6 


1 


10-120 


954 


COX6C


Cytochrome c oxidase subunit 
VIc


1.2e-14 


62.1 


-j 


M7 


956


Myc_N_term


Myc amino-terminal region 


8.1 


-185.0 


1 


209-500 


961 


Cadherin_C_term


Cadherin cytoplasmic region 


7.8 


-83.3 


1 


41-148 


962 


cadherin 


Cadherin domain 


0.17


9.3 




19-109 


962 


Cadherin_C_term


Cadherin cytoplasmic region 


0.89 


-70.9 




159-319 


963 


cadherin 


Cadherin domain 


0.17 


9.3 




19-109 


963 


Cadherin_C_term


Cadherin cytoplasmic region 


0.21 


-62.6 




159-301 


992 


Keratin_B2 


Keratin, high sulfur B2
protein 


5.8 


-80.3 


1 


28-166 


999 


Patatin 


Patatin-like phospholipase


1.4e-54 


194.7 




30-196 


1001 


S1


S1 RNA binding domain


1.3 


4.5 




67-131 


1004 


Branch 


Core-2/I-Branching enzyme 


0.00014 


-64.7 




3-317 


1029 


Mur_ligase_C


Mur ligase family, glutamate 
ligase doma 


7.9 


-11.9 




161-235 


1040 


Seryl_tRNA_N


Seryl-tRNA synthetase N- 
terminal domain 


6 


0.1




56-102 


1041 


heme_1


Heme/Steroid binding domain 


0.00024 


22.9 




19-98 
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PDB annotation 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION 
REGULATION/DNA). DNA- 
BINDING. 2 NUCLEAR 
PROTEIN, ETS DOMAIN. 
ANKYRIN REPEATS, 
TRANSCRIPTION 3 FACTOR 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
GABPALPHA; GABPBETA1;
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), DNA- 
BINDING. 2 NUCLEAR 
PROTEIN, ETS DOMAIN, 
ANKYRIN REPEATS. 
TRANSCRIPTION 3 FACTOR 


TUMOR SUPPRESSOR 
TUMOR SUPPRESSOR. 
CDK4/6 INHIBITOR, 
ANKYRIN MOTIF 


COMPLEX (KINASE/ANTI- 
ONCOGENE) CDK6; 
P16INK4A, MTS1; CYCLIN
DEPENDENT KINASE, 
CYCLIN DEPENDENT 
KINASE INHIBITORY 2 
PROTEIN. CDK, INK4, CELL 
CYCLE, MULTIPLE TUMOR 
SUPPRESSOR, 3 MTS1,
COMPLEX (KINASE/ANTI- 
ONCOGENE) HEADER 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE. CELL CYCLE 2 
CONTROL, ALPHA/BETA, 


Compound 


GA BINDING PROTEIN 
ALPHA; CHAIN: A; GA 
BINDING PROTEIN BETA 
1; CHAIN: B; DNA; CHAIN:
D, E; 


GA BINDING PROTEIN 
ALPHA; CHAIN: A; GA 
BINDING PROTEIN BETA 
1 ; CHAIN: B; DNA; CHAIN: 
D, E;


P19INK4D CDK4/6
INHIBITOR; CHAIN: NULL; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
MULTIPLE TUMOR 
SUPPRESSOR; CHAIN: B; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B;


SEQ FOLD 
score 












PMF 
score 


0.52 


0.29


0.19 


0.04


0.00


Verify 
score 


-0.23


-0.09 


o 
9 


d 




Psi 
Blast 


1 

u 

OQ 


? 


& 


u 
tT 

vd 


VO 

<s 
vd 




r- 

tr> 




O 
v© 


o 


s 


START 
AA 


CO 


rn 


rs 


ro 




CHAIN 
ID 


m 








CQ 




1awc


1awc


1 


'x> 


K 


SEQ ID 
NO 


r- 
r4 




r- 


pj 
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PDB annotation 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN. 
CYCLIN-DEPENDENT 
KINASE. CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR 
PROTEIN/KINASE) 


HORMONE/GROWTH 
FACTOR P18-INK4C; CELL 
CYCLE INHIBITOR, 
P18INK4C, TUMOR, 
SUPPRESSOR. CYCLIN- 2 
DEPENDENT KINASE. 
HORMONE/GROWTH 
FACTOR 


SIGNALING PROTEIN 
HELIX-TURN-HELIX. 
ANKYRIN REPEAT 


CELL CYCLE INHIBITOR 
P18-INK4C(INK6); CELL 
CYCLE INHIBITOR, P18-
INK4C(INK6), ANKYRIN
REPEAT. 2 CDK 4/6 
INHIBITOR 


CELL CYCLE INHIBITOR 
P18-INK4C(INK6); CELL
CYCLE INHIBITOR, P18-
INK4C(INK6), ANKYRIN
REPEAT, 2 CDK 4/6 
INHIBITOR 


ANK-REPEAT I 
MYOTROPHIN, 

ACETYLATION, NMR, ANK-
REPEAT




g 
? 

5 
a 
3 

CO 

< 

O 


Compound 




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


CYCLIN-DEPENDENT 
KINASE 6 INHIBITOR; 
CHAIN: A; 


CYCLIN-DEPENDENT 
KINASE 4 INHIBITOR B; 
CHAIN: A; 


CYCLIN-DEPENDENT 
KINASE 6 INHIBITOR; 
CHAIN: A, B;


CYCLIN-DEPENDENT 
KINASE 6 INHIBITOR;
CHAIN: A, B; 


MYOTROPHIN; CHAIN:
NULL 




UBIQUITIN


SEQ FOLD
score 




















PMF 
score 




0.37 


0.15 

i 

i 


0.06 


0.10 


0.37 


0.01 




1 0.07 


Verify 
score 




0.01 


IN 

d 


0.14 


CN 
O 


0.01 


0.14 




0.16 1 


Psi
Blast 




m 

^. 


in 
6 


r-- 

u 
vq 


CN 


S 
t 

a> 

IN 

rn 


o 

CN 

\o 




O 

ii 

CN 






en 

00 




so 
«n 




a 


r- 

CN 






START 
AA 




o 

»o 




m 


00 




9k 






CHAIN 
ID 




0Q 


< 


< 


< 


< 






< 


Ss 








ON 




1ihb


1myo




a 
u 
p 


SEQ ID 
NO 




m 


CN 


CN 




IN 
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PDB annotation 


UBIQUITIN-CONJUGATING 
ENZYME, YEAST


UBIQUITIN CONJUGATION 
UBCI; UBIQUITIN 
CONJUGATION. LIGASE 




STRUCTURAL PROTEIN 
TWO REPEATS OF 
SPECTRIN, ALPHA HELICAL 
LINKER REGION, 2 2 
TANDEM 3-HELIX COILED- 
COILS. STRUCTURAL 
PROTEIN 


ENDOCYTOSIS/EXOCYTOSIS
SYNAPTOTAGMIN
ASSOCIATED 35 KDA 
PROTEIN, P35A. THREE 
HELIX BUNDLE 


TRANSCRIPTION MAX 
DIMERIZATION PROTEIN; 
FOUR-HELIX BUNDLE, 
PROTEIN-PEPTIDE 
COMPLEX 




CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED 
COIL, CONTRACTILE
PROTEIN 




TETRAHYDROBIOPTERIN 
BIOSYNTHESIS 
TETRAHYDROBIOPTERIN 
BIOSYNTHESIS. 
PHOSPHATE ELIMINATION,
2 PTERINE SYNTHESIS 


3- 


STRUCTURAL PROTEIN
TWO REPEATS OF
SPECTRIN, ALPHA HELICAL


Compound 


CONJUGATING ENZYME; 
CHAIN: A; 


UBIQUITIN 

CONJUGATING ENZYME; 
CHAIN: NULL; 




ALPHA SPECTRIN; CHAIN: 
A, B, C;


SYNTAXIN-1A; CHAIN: A,
B, C; 


MAD1 PROTEIN; CHAIN:
A; SIN3A; CHAIN: B; 




HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ
2; CHAIN: A; 




6-PYRUVOYL 


TETRAHYDROPTERIN 
SYNTHASE; CHAIN: A, B;




ALPHA SPECTRIN; CHAIN: 
A, B, C;


SEQ FOLD 
score 








57.03 








55.25 








55.09 


PMF 
score 




0.92 






0.49 


0.16 








0.90 






Verify 
score 




-0.44 






-0.01 


0.47 








so 

d 






Psi
Blast 




oo 
O 

u 




6000 0 


9 

On 


0.006 




NO 
O 




o 




0.00015 






o 








CN 












o 

NO 
(N 


START 
AA 












•A 

m 








NO 




XT 


CHAIN 
ID 








< 


< 


m 




< 




< 




< 


PDB 
ID 




a 
a 

CN 




1cun


8 
v« 


u 




1quu




\o 

JD 




1cun


SEQ ID 

NO 








in 


*n 






00 

*n 

lO 




^ 
O 




vn 
r- 
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PDB annotation 


LINKER REGION, 2 2 
TANDEM 3-HELIX COILED- 
COILS, STRUCTURAL 
PROTEIN 


CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED 
COIL, CONTRACTILE 
PROTEIN




PHOSPHOLIPID ANALOG 
PLACENTAL 
ANTICOAGULANT 
PROTEIN; PHOSPHOLIPID 
ANALOG, CALCIUM 
BINDING PROTEIN. 
MEMBRANE 2 BINDING 
PROTEIN 




3^ 


PHOSPHOLIPID ANALOG 
PLACENTAL
ANTICOAGULANT 
PROTEIN; PHOSPHOLIPID 
ANALOG, CALCIUM
BINDING PROTEIN. 
MEMBRANE 2 BINDING 
PROTEIN 


PHOSPHOLIPID ANALOG
PLACENTAL


Compound 




HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 




ANNEXIN V; CHAIN: 
NULL; 

1 


CALCIUM/PHOSPHOLIPID
BINDING ANNEXIN V
(LIPOCORTIN V,
ENDONEXIN II,
PLACENTAL 1HVD 3
ANTICOAGULANT
PROTEIN) (CALCIUM IONS
ARE VISIBLE) MUTATION
1HVD 4 WITH GLU 17
REPLACED BY GLY (E17G)
1HVD 5




ANNEXIN V; CHAIN: 
NULL; 


ANNEXIN V; CHAIN:
NULL; 


SEQ FOLD 
score 




64.26 




51.70 


<N 

\d 








PMF 
score 














1.00 


o 
o 


Verify 
score 














-0.64 


-0.39 


Psi
Blast 




o 




«n 






? 


o 






s 

CN 










m 
n 




START 

i AA 




OV 
CS 














CHAIN 
ID 




< 














&« 




1quu




00 
a 


1hvd




cer 
00 
e« 


ex 
oo 
(d 


SEQ ID 
NO 








OS 


s 




O 


O 
m 
^0 
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PDB annotation 


ANTICOAGULANT 
PROTEIN; PHOSPHOLIPID 
ANALOG, CALCIUM 
BINDING PROTEIN. 
MEMBRANE 2 BINDING 
PROTEIN 






TRANSFERASE 
METHYLTRANSFERASE 


METHYLTRANSFERASE 
GNMT. S-ADENOSYL-L- 
METHIONINES GLYCINE 
METHYLTRANSFERASE 




SUGAR BINDING PROTEIN 
NGAL; NEUTROPHIL, NGAL, 
LIPOCALIN




SUGAR BINDING PROTEIN
NGAL; NEUTROPHIL 
LIPOCALIN, SIGNAL 


Compound 




CALCIUM/PHOSPHOLIPID
BINDING ANNEXIN V
(LIPOCORTIN V,
ENDONEXIN II,
PLACENTAL 1HVD 3
ANTICOAGULANT
PROTEIN) (CALCIUM IONS
ARE VISIBLE) MUTATION
1HVD 4 WITH GLU17
REPLACED BY GLY (E17G)
1HVD 5




GLYCINE N-

METHYLTRANSFERASE; 
CHAIN: A, B, C, D;


GLYCINE N- 

METHYLTRANSFERASE; 
CHAIN: A, B; 




HUMAN NEUTROPHIL
GELATINASE; CHAIN: A, 
B; 


RETINOIC ACID-BINDING 
PROTEIN EPIDIDYMAL 
RETINOIC ACID-BINDING 
PROTEIN 1EPA 3
(ANDROGEN DEPENDENT
SECRETORY PROTEIN) (B-
FORM) 1EPA 4


NEUTROPHIL 
GELATINASE; CHAIN: A; 


SEQ FOLD 
score 




















PMF 
score 




0.98 




so 

d 


0.62 




0.16 


0.15 


0.07 


Verify 
score 




-0.71 




-0.19 


-0.15 




-0.14 


-0.06 


0.04 






o 




0.0036 


1 0.0036 




1 0.0032 


o 
o 

!2 


o 
oo 


§^ 




ro 














r* 


START 
AA 




















CHAIN 
ID 








< 


< 




< 


< 


< 






J 






1xva




1dfv


1epa


o* 


SEQ ID 
NO 




O 
m 

VO 




00 


00 




P; 

p- 




m 
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203 



PDB annotation 


PROTEIN, GLYCOPROTEIN 


SUGAR BINDING PROTEIN 
NGAL; NEUTROPHIL 
LIPOCALIN, SIGNAL
PROTEIN, GLYCOPROTEIN 




GLYCOSIDASE GUS GENE 
PRODUCT; LYSOSOMAL 
ENZYME, ACID
HYDROLASE, 
GLYCOSIDASE 




TRANSFERASE 
METHYLTRANSFERASE 


TRANSFERASE 
METHYLTRANSFERASE 


STRUCTURAL GENOMICS 
HYPOTHETICAL PROTEIN, 
METHANOCOCCUS 
JANNASCHII


TRANSFERASE SAM- 
BINDING DOMAIN, BETA- 
BARREL, MIXED ALPHA- 
BETA, HEXAMER, 2 DIMER


TRANSFERASE SAM- 
BINDING DOMAIN, BETA- 
BARREL, MIXED ALPHA-
BETA, HEXAMER, 2 DIMER 


TRANSFERASE 

(METHYLTRANSFERASE) 

COMT; TRANSFERASE, 

METHYLTRANSFERASE,

NEUROTRANSMITTER 

DEGRADATION 


Z 2^ ^ 

< S w 
S 9 z 

32 fc ac 
SOS 


Compound 




NEUTROPHIL 
GELATINASE; CHAIN: A;




BETA-GLUCURONIDASE; 
CHAIN: A, B;




GLYCINE N- 

METHYLTRANSFERASE; 
CHAIN: A, B, C, D;


GLYCINE N- 

METHYLTRANSFERASE; 
CHAIN: A, B, C, D;


MJ0882; CHAIN: A; 


HNRNP ARGININE N-
METHYLTRANSFERASE;
CHAIN: 1, 2, 3, 4, 5, 6;


HNRNP ARGININE N-
METHYLTRANSFERASE;
CHAIN: 1, 2, 3, 4, 5, 6;


CATECHOL O- 
METHYLTRANSFERASE; 
CHAIN: NULL; 


GLYCINE N- 
METHYLTRANSFERASE; 
CHAIN: A, B; 


SEQ FOLD
score 


























PMF 
score 




en 
d 




0.78 




0.58 


; 0.36 


00 
G 


1.00 


1.00 


0.00 


0.65 


Verify 
score 




-0.29 




-0.54 




o 


o 


-0.08 


0.57 


m 
d 


0.12 


o 






0.0019 




rn 
u 




r- 
ri 


fM 

•7 
U 

>o 


1e-12


rs 
eo 
ii 

fM 


1e-63


fM 
■ 

o\ 


eo 










in 




vo 

CN 


m 

tN 


rsi 


o\ 
o\ 

CM 


o 


fM 

(N 


CM 
*n 


START 
AA 








CM 




OO 


eo 








a 




CHAIN 
ID 








< 




< 


< 


< 








< 


PDB 
ID 




1qqs




x: 




<N 
T3 


JZ 

rs 

TJ 


1dus


«r 
00 


00 


1vid


K 


SEQ ID 
NO 




m 




In 




fN 

r* 


(N 

*n 


«n 


CM 
w% 
r* 


S? 

r- 


fM
PDB annotation
METHYLTRANSFERASE


METHYLTRANSFERASE 


METHYLTRANSFERASE,
ERM, ERMAM, MLS
ANTIBIOTICS, NMR, 2 RRNA


TRANSFERASE 
METHYLTRANSFERASE 


TRANSFERASE 
METHYLTRANSFERASE 


STRUCTURAL GENOMICS
HYPOTHETICAL PROTEIN,
METHANOCOCCUS
JANNASCHII


TRANSFERASE SAM-
BINDING DOMAIN, BETA-
BARREL, MIXED ALPHA-
BETA, HEXAMER, 2 DIMER


TRANSFERASE SAM-
BINDING DOMAIN, BETA-
BARREL, MIXED ALPHA-
BETA, HEXAMER, 2 DIMER


TRANSFERASE 

(METHYLTRANSFERASE) 

COMT; TRANSFERASE,

METHYLTRANSFERASE, 

NEUROTRANSMITTER 

DEGRADATION 


TRANSFERASE 

(METHYLTRANSFERASE) 

COMT; TRANSFERASE,

METHYLTRANSFERASE,

NEUROTRANSMITTER

DEGRADATION 


p 

s < 
[Si 


Compound 


RRNA 

METHYLTRANSFERASE; 
CHAIN: NULL; 


GLYCINE N-
METHYLTRANSFERASE;
CHAIN: A, B, C, D;


GLYCINE N- 

METHYLTRANSFERASE; 
CHAIN: A, B, C, D;


MJ0882; CHAIN: A; 


HNRNP ARGININE N- 
METHYLTRANSFERASE; 
CHAIN: 1, 2, 3, 4, 5, 6;


HNRNP ARGININE N- 
METHYLTRANSFERASE; 
CHAIN: 1, 2, 3, 4, 5, 6;


CATECHOL O- 
METHYLTRANSFERASE; 
CHAIN: NULL; 


CATECHOL O- 
METHYLTRANSFERASE; 
CHAIN: NULL; 


GLYCINE N- 
METHYLTRANSFERASE; 


SEQ FOLD 
score 




















PMF 

score 


0.06 


M 

d 


0.36 


0.84 


1.00 


1.00 


0.00 


00 

d 


0.65 


Verify 
score 


O 




i 


-0.05 


0.25 


0.49 


0.12 


-0.08 


o 
o 




1 


so 

i 


«o 




sn 


\a 

OO 


CN 
^* 


! 

Ml 




g< 
§< 






GO 

s 




o 
m 






CM 


r- 


START 
AA 






S3 


00 






O 


CM 


o 


CHAIN 
ID 




< 


< 


< 










< 


A Q 

£- 


1yub




^ 




to 


o 
eo 


1vid




1xva


SEQ ID 
NO 




m 


m 
r- 


m 
m 


ro 
m 
r* 


m 
•n 


m 


m 


»n
PDB annotation


METHIONINE:GLYCINE
METHYLTRANSFERASE 


METHYLTRANSFERASE
ERMAM;

METHYLTRANSFERASE,
ERM, ERMAM, MLS
ANTIBIOTICS, NMR, 2 RRNA


ENDOCYTOSIS/EXOCYTOSIS
SYNAPTOTAGMIN
ASSOCIATED 35 KDA
PROTEIN, P35A, THREE
HELIX BUNDLE


ENDOCYTOSIS/EXOCYTOSIS
NSEC1; PROTEIN-PROTEIN
COMPLEX, MULTI-SUBUNIT


TRANSFERASE HRS; HRS,
VHS, FYVE, ZINC FINGER,
SUPERHELIX


SIGNALING PROTEIN GBP,
GTP HYDROLYSIS, GDP,
GMP, INTERFERON
INDUCED, DYNAMIN 2
RELATED, LARGE GTPASE
FAMILY, GMPPNP, GPPNHP.


STRUCTURAL PROTEIN
TWO REPEATS OF
SPECTRIN, ALPHA HELICAL
LINKER REGION, 2
TANDEM 3-HELIX COILED-
COILS, STRUCTURAL
PROTEIN


COMPLEX

(INHIBITOR/NUCLEASE)
COMPLEX
(INHIBITOR/NUCLEASE).


Compound 


CHAIN: A, B; 


RRNA 

METHYLTRANSFERASE; 
CHAIN: NULL; 


SYNTAXIN-1 A; CHAIN: A, 
B, C; 


SYNTAXIN BINDING 
PROTEIN 1; CHAIN: A; 
SYNTAXIN 1A; CHAIN: B;


HEPATOCYTE GROWTH 
FACTOR-REGULATED 
TYROSINE CHAIN: A; 


INTERFERON-INDUCED 
GUANYLATE-BINDING 
PROTEIN 1; CHAIN: A; 


ALPHA SPECTRIN; CHAIN: 
A, B, C;


RIBONUCLEASE 
INHIBITOR; CHAIN: A, D; 
ANGIOGENIN; CHAIN: B, 


SEQ FOLD 
score 
















90.38 


PMF 
score 




0.01


n 
d 


0.25 


1 0.15 


o 
d 


0.21 




Verify 
score 




so 
"A 
o 


0.24 


-0.42 


o 


o 

CN 

9 


-0.09 

1 




Psi 
Blast 




NO 

1 


o 

v 


NO 

o 

O 

\a 


0.006 


•o 

VO 


0.006 


i> 






3 


% 


CM 
? 


S 


SO 
rs 


m 
o 

fS 


C4 
00 


START 
AA 






00 






00 


m 
oo 




CHAIN 

in 






< 


n 


< 


< 


< 


< 


PDB 1 
tn 




1yub




1dn1
>>
rr 

C9 


SEQ ID 1 


; 


m
00
00
PDB annotation


COMPLEX (RI-ANG),
HYDROLASE 2 MOLECULAR
RECOGNITION, EPITOPE
MAPPING, LEUCINE-RICH 3
REPEATS


ACETYLATION RNASE 
INHIBITOR, 

RIBONUCLEASE/ANGIOGENIN
INHIBITOR
ACETYLATION, LEUCINE-
RICH REPEATS




SIGNALING 

PROTEIN/TRANSFERASE 
NAK; COMPLEX, SIGNAL 
TRANSDUCTION, 
PHOSPHOTYROSINE 
BINDING 2 DOMAIN (PTB),
ASYMMETRIC CELL 
DIVISION 




COMPLEX 

(TRANSDUCER/TRANSDUCTION)
GT BETA-GAMMA;
MEKA, PP33; PHOSDUCIN,
TRANSDUCIN, BETA-
GAMMA, SIGNAL
TRANSDUCTION, 2
REGULATION,
PHOSPHORYLATION, G
PROTEINS, THIOREDOXIN, 3
VISION, MEKA, COMPLEX
(TRANSDUCER/TRANSDUCTION),
4 POST-
TRANSLATIONAL
MODIFICATION, FARNESYL,
FARNESYLATION HEADER
HETNAM

SIGNALING PROTEIN GT 


Compound 




RIBONUCLEASE 
INHIBITOR; CHAIN: NULL;




NUMB PROTEIN; CHAIN: 
A; NUMB ASSOCIATE 
KINASE; CHAIN: B; 




TRANSDUCIN; CHAIN: B,
G; PHOSDUCIN; CHAIN: P; 

TRANSDUCIN; CHAIN: A; 


SEQ FOLD 
score 




89.16 










PMF 
score 








1 0.71 






Verify 
score 








0.30 






Psi 
Blast 




rn 




m 
o 






^< 




o> 




S 




S 5 


START 
AA
ID
ID
PDB annotation


BETA; GT GAMMA; MEKA, 
PP33; PHOSDUCIN, 
TRANSDUCIN, BETA- 
GAMMA. SIGNAL 
TRANSDUCTION, 2 
REGULATION, 
PHOSPHORYLATION, G 
PROTEINS, THIOREDOXIN, 3 
, VISION, MEKA, COMPLEX 
(TRANSDUCER/ 
TRANSDUCTION), 
SIGNALING 4 PROTEIN 


COMPLEX 

(TRANSDUCER/TRANSDUCTION)
GT BETA-GAMMA;
MEKA, PP33; PHOSDUCIN,
TRANSDUCIN, BETA-
GAMMA, SIGNAL
TRANSDUCTION, 2
REGULATION,
PHOSPHORYLATION, G
PROTEINS, THIOREDOXIN, 3
VISION, MEKA, COMPLEX
(TRANSDUCER/TRANSDUCTION)




TRANSPORT PROTEIN RHO-
GTPASE EXCHANGE
FACTOR, TRANSPORT 
PROTEIN 




Compound 


TRANSDUCIN; CHAIN: B; 
PHOSDUCIN; CHAIN: C; 


TRANSDUCIN; CHAIN: B, 
G; PHOSDUCIN; CHAIN: P; 




PIX; CHAIN: A; 




SEQ FOLD 
score 




63.15 




51.70 




PMF 
score 












Verify 
score 












Psi 
Blast 




«M 
1 

U 
00 

\d 




9900*0 




W ^ 




o 

CM 




o 
o 

CM 




START
OO
ID
CM




1by1
oo

Table 6



SEQ ID NO:   Position of Signal Peptide   Maximum score   Mean score
530          23                           0.923           0.602
569          23                           0.923           0.602
961          23                           0.923           0.602

Table 7





Chromosomal location


1 
1 


1 V 


L 


X'KqIX 1-51 1 


i 




4 


1 1 
1 1 


0 




10 


O 


1 1 


1 
1 


13 


1 / 


1 A 

14 




IS 


14 


16 




17 


lOpl J.P 


18 


1 icen-qiz.i 


19 


1 /pi J 


20 


13 


21 


0 


22 


yqJ4.1 J-<ii4o 


24 


->(J14 


25 


1 / 


26 




27 


z 


29 


1Q 


30 


1 1 


31 




32 


15ql5-q21.1 


34 




35 




36 




37 


V 


38 


oq22-C|Z3 


39 


/ 


41 


0 


46 


I J 


ill 
47 


I 


48 


\ 
J 


49 


7 


53 




5.5 


in 
lU 


C.£. 

56 


11p13


57 


1 7 


59 


10 




Xq28 


61 


7 


62 


20 


63 


20 


64 


6 


65 


16 


66 


14 


67 


8 


68 


1pter-p12


71 


22 


73 


1p36.3-p36.2
SEQ ID NO:


Chromosomal location


74   15q22
75   15q22
76   12
77   15
79   22q13.1
80   16
81   16
82   2q13
83   10
84   5
86   17
87   16q24.3
89   X
90   10

CLAIMS

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from
the group consisting of SEQ ID NO: 1 - 526, a mature protein coding portion of SEQ
ID NO: 1 - 526, an active domain coding protein of SEQ ID NO: 1 - 526, and
complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said polynucleotide has greater than about 90% sequence identity with the
polynucleotide of claim 1.

3. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

4. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises
the complementary sequences.

5. A vector comprising the polynucleotide of claim 1.

6. An expression vector comprising the polynucleotide of claim 1.

7. A host cell genetically engineered to comprise the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1
operatively associated with a regulatory sequence that modulates expression of the
polynucleotide in the host cell.

9. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of a polypeptide encoded by any one of the polynucleotides of claim 1 (i.e.,
SEQ ID NO: 527-1052).

10. A composition comprising the polypeptide of claim 9 and a carrier.

11. An antibody directed against the polypeptide of claim 9.

12. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms
a complex with the polynucleotide of claim 1 for a period sufficient to form the
complex; and

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim
1 in the sample.

14. The method of claim 13, wherein the polynucleotide is an RNA molecule and
the method further comprises reverse transcribing an annealed RNA molecule into a
cDNA polynucleotide.

15. A method for detecting the polypeptide of claim 9 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form
the complex; and

b) detecting formation of the complex, so that if a complex 
formation is detected, the polypeptide of claim 9 is detected. 

16. A method for identifying a compound that binds to the polypeptide of claim 9,
comprising:

a) contacting the compound with the polypeptide of claim 9 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound
complex is detected, a compound that binds to the polypeptide of claim 9 is
identified.

17. A method for identifying a compound that binds to the polypeptide of claim 9,
comprising:

a) contacting the compound with the polypeptide of claim 9, in a
cell, under conditions sufficient to form a polypeptide/compound complex, wherein
the complex drives expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence
expression, so that if the polypeptide/compound complex is detected, a compound
that binds to the polypeptide of claim 9 is identified.

18. A method of producing the polypeptide of claim 9, comprising:

a) culturing a host cell comprising a polynucleotide sequence
selected from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-526,
a mature protein coding portion of SEQ ID NO: 1-526, an active domain coding
portion of SEQ ID NO: 1-526, complementary sequences thereof, under conditions
sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a).

19. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides from the Sequence Listing, the
mature protein portion thereof, or the active domain thereof.

20. The polypeptide of claim 21 wherein the polypeptide is provided on a
polypeptide array.

21. A collection of polynucleotides, wherein the collection comprises the
sequence information of at least one of SEQ ID NO: 1 - 526.

22. The collection of claim 21, wherein the collection is provided on a nucleic
acid array.

23. The collection of claim 22, wherein the array detects full-matches to any one 
of the polynucleotides in the collection. 

24. The collection of claim 22, wherein the array detects mismatches to any one
of the polynucleotides in the collection. 

25. The collection of claim 21, wherein the collection is provided in a
computer-readable format.

26. A method of treatment comprising administering to a mammalian subject in 
need thereof a therapeutic amount of a composition comprising a polypeptide of 
claim 9 or 19 and a pharmaceutically acceptable carrier. 

27. A method of treatment comprising administering to a mammalian subject in
need thereof a therapeutic amount of a composition comprising an antibody that
specifically binds to a polypeptide of claim 9 or 19 and a pharmaceutically acceptable
carrier.

ABG99910;

17-JAN-2003 (first entry) 
Human novel polypeptide #23. 

Human; genetic disorder; gene mapping; medical imaging; cancer; 
neurodegenerative disorder; lymphoid cell disorder; osteoporosis; 
Parkinson's disease; Alzheimer's disease; bone degenerative disorder; 
osteoarthritis; periodontal disease; liver fibrosis; viral infection; 
fungal infection; bacterial infection; autoimmune disease; diabetes;
atopic dermatitis.

Homo sapiens.
XX

PF 14-MAR-2002; 2002WO-US005109.

XX 

PR 15-MAR-2001; 2001US-00810173.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Zhou P, Goodrich R, Asundi V, Zhang J, Zhao QA, Ren P; 

PI Xue AJ, Yang Y, Ma Y, Yamazaki V, Chen R, Wang Z, Ghosh M;

PI Wehrman T, Wang J, Wang D, Drmanac RT; 
XX 

DR WPI; 2003-040556/03. 

DR N-PSDB; ABX05008. 
XX 

PT New isolated polypeptides and polynucleotides, useful for preventing, 

PT treating or ameliorating medical conditions, such as cancer, 

PT neurodegenerative disorders, lymphoid cell disorders, bone degenerative 

PT disorders, and infections. 

XX 

PS Claim 9; SEQ ID NO 549; 235pp; English.
XX 

CC The invention relates to human polynucleotides and the polypeptides they 

CC encode. The polynucleotides and polypeptides are useful in diagnostics, 

CC forensics, gene mapping, medical imaging, identification of mutations 

CC responsible for genetic disorders or other traits, assessing biodiversity 

CC and producing many other types of data and products dependent on DNA and 

CC amino acid sequences. They are also useful for preventing, treating or 

CC ameliorating medical conditions, such as cancer, neurodegenerative 

CC disorders (e.g. Parkinson's disease, Alzheimer's disease), lymphoid cell

CC disorders, osteoporosis, osteoarthritis, bone degenerative disorders, 

CC periodontal disease, liver fibrosis, infections (e.g. viral, fungal or 

CC bacterial) or autoimmune diseases (e.g. diabetes, atopic dermatitis). 

CC Sequences ABG99888-ABG99989 and ABU00010-ABU00433 represent human

CC polypeptides of the invention. Note: The sequence data for this patent is 

CC not represented in the printed specification but is based on sequence
CC information supplied by the European Patent Office.
XX 

SQ Sequence 313 AA; 

Query Match 100.0%; Score 1318; DB 6; Length 313; 

Best Local Similarity 100.0%; Pred. No. 3.4e-138; 

Matches 254; Conservative 0; Mismatches 0; Indels 0;



Gaps 



Qy          1 PHVRSFHHHFHTCRVQGAWPLLDNDFLFVQATSSPMALGANATATRKLTIIFKNMQECID 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

New human molecules for disease detection and treatment (MDDT), useful
for diagnosing, treating and preventing diseases or conditions associated 
with the aberrant mddt expression e.g. cancer, AIDS, atherosclerosis, 
epilepsy, or infections.
The invention relates to a novel isolated human MDDT (molecule for
disease detection and treatment) polypeptide. The polypeptide of the
invention demonstrates cytostatic, antiarteriosclerotic, anticonvulsant,
nootropic, neuroprotective, cerebroprotective, anti-HIV, antiallergic,
antiinflammatory and thyromimetic activities and may be useful for
diagnosing, treating and preventing a variety of diseases including cell
proliferative diseases such as cancer and atherosclerosis, neurological
diseases, in particular epilepsy, Huntington's disease and stroke, immune 
or inflammatory diseases including AIDS and allergies and developmental 
disorders including hypothyroidism and Cushing's syndrome, as well as